Neural networks have shown great success in everything from playing Go and Atari games to image recognition and language translation. But often overlooked is that the success of a neural network at a particular application is often determined by a series of choices made at the start of the research, including what type of network to use and the data and method used to train it. Currently, these choices - known as hyperparameters - are chosen through experience, random search or a computationally intensive search processes.
In our most recent paper, we introduce a new method for training neural networks which allows an experimenter to quickly choose the best set of hyperparameters and model for the task. This technique - known as Population Based Training (PBT) - trains and optimises a series of networks at the same time, allowing the optimal set-up to be quickly found. Crucially, this adds no computational overhead, can be done as quickly as traditional techniques and is easy to integrate into existing machine learning pipelines.
The technique is a hybrid of the two most commonly used methods for hyperparameter optimisation: random search and hand-tuning. In random search, a population of neural networks are trained independently in parallel and at the end of training the highest performing model is selected. Typically, this means that a small fraction of the population will be trained with good hyperparameters, but many more will be trained with bad ones, wasting computer resources