Hyperparameter (machine learning)
Different model training algorithms require different hyperparameters; some simple algorithms, such as ordinary least squares regression, require none. Given these hyperparameters, the training algorithm learns the parameters from the data. For instance, LASSO adds a regularization hyperparameter to ordinary least squares regression, and this hyperparameter has to be set before the training algorithm estimates the parameters.
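The split between a hyperparameter that is fixed in advance and parameters that are learned from data can be sketched in a few lines. The following is a minimal illustration, not a reference implementation: it assumes the common objective (1/(2n))·||y − Xw||² + alpha·||w||₁ without an intercept, and the names `fit_lasso`, `soft_threshold`, and `alpha`, as well as the toy data, are hypothetical choices made for this example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the update induced by an L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, alpha, n_iters=200):
    """Coordinate descent for (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1, no intercept.

    alpha is the hyperparameter: it is chosen by the caller before training.
    w holds the parameters: they are learned from the data.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho = 0.0   # correlation of feature j with the residual excluding feature j
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (norm / n)
    return w


# Toy data (assumed for illustration): y depends only on the first feature.
X = [[1.0, 1.0], [2.0, -1.0], [3.0, -1.0], [4.0, 1.0]]
y = [2.0, 4.0, 6.0, 8.0]

w_ols = fit_lasso(X, y, alpha=0.0)     # alpha = 0 recovers ordinary least squares
w_lasso = fit_lasso(X, y, alpha=1.0)   # moderate penalty shrinks the weights
w_heavy = fit_lasso(X, y, alpha=20.0)  # large penalty drives every weight to zero
```

With `alpha=0.0` the procedure reduces to ordinary least squares, which has no hyperparameter to set; each other value of `alpha` yields a different set of learned parameters from the same data.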
The time required to train and test a model can depend on the choice of its hyperparameters. Because learning is inherently stochastic, the performance measured empirically for a hyperparameter setting is not necessarily its true performance. A hyperparameter is usually of continuous or integer type, leading to mixed-type optimization problems. The existence of some hyperparameters is conditional on the value of others; e.g. the size of each hidden layer in a neural network can be conditional on the number of layers.
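A conditional, mixed-type search space of this kind can be sketched as a sampling function. This is only an illustration under assumed names and ranges: `sample_configuration`, the key names, the learning-rate range, and the candidate layer sizes are all hypothetical.

```python
import random


def sample_configuration(rng):
    """Draw one configuration from a mixed-type, conditional search space."""
    config = {
        # Continuous hyperparameter, sampled log-uniformly from [1e-4, 1e-1].
        "learning_rate": 10 ** rng.uniform(-4, -1),
        # Integer hyperparameter: 1, 2, or 3 hidden layers.
        "num_layers": rng.randint(1, 3),
    }
    # Conditional hyperparameters: one size per layer, so the number of
    # size entries that exist depends on the sampled num_layers.
    for layer in range(config["num_layers"]):
        config[f"hidden_size_{layer}"] = rng.choice([32, 64, 128])
    return config


rng = random.Random(0)  # fixed seed so the draws are reproducible
configs = [sample_configuration(rng) for _ in range(5)]
```

A configuration with `num_layers == 1` has no `hidden_size_1` entry at all, which is what distinguishes a conditional hyperparameter from one that is merely unused.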
Most performance variation can be attributed to just a few hyperparameters. For an LSTM, the learning rate is the most crucial hyperparameter, followed by the network size, whereas batching and momentum have no significant effect on its performance.
Hyperparameter optimization finds a tuple of hyperparameters that yields an optimal model, that is, one that minimizes a predefined loss function on given test data. The objective function takes a tuple of hyperparameters and returns the associated loss.
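The objective function described above can be sketched with an exhaustive grid search, which is only one of several possible search strategies. Everything in this example is an assumption made for illustration: the toy data, the one-dimensional ridge model, the hyperparameter tuple `(lam, use_intercept)`, and the grid values. The held-out points stand in for the "given test data".

```python
from itertools import product

# Toy data (assumed): training points and separate held-out points.
train = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]
held_out = [(4.0, 9.1), (5.0, 10.9)]


def fit_ridge_1d(data, lam, use_intercept):
    """Closed-form 1-D ridge regression; lam penalises the slope only."""
    n = len(data)
    x_mean = sum(x for x, _ in data) / n if use_intercept else 0.0
    y_mean = sum(y for _, y in data) / n if use_intercept else 0.0
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in data)
    sxx = sum((x - x_mean) ** 2 for x, _ in data)
    slope = sxy / (sxx + lam)
    intercept = y_mean - slope * x_mean
    return slope, intercept


def objective(hyperparameters):
    """Take a tuple of hyperparameters and return the associated loss."""
    lam, use_intercept = hyperparameters
    slope, intercept = fit_ridge_1d(train, lam, use_intercept)
    # Mean squared error on the held-out points.
    errors = [(y - (slope * x + intercept)) ** 2 for x, y in held_out]
    return sum(errors) / len(errors)


# Candidate tuples: every combination of the listed values.
grid = list(product([0.0, 0.1, 1.0, 10.0], [False, True]))
best = min(grid, key=objective)
```

Grid search evaluates `objective` once per tuple, so its cost grows multiplicatively with the number of values tried per hyperparameter.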