Optimizer Benchmarking Needs to Account for Hyperparameter Tuning

Authors: Prabhu Teja Sivaprasad, Florian Mai, Thijs Vogels, Martin Jaggi, François Fleuret

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evaluating a variety of optimizers on an extensive set of standard datasets and architectures, our results indicate that Adam is the most practical solution, particularly in low-budget scenarios.
Researcher Affiliation | Academia | ¹Idiap Research Institute, Switzerland, ²EPFL, Switzerland, ³University of Geneva, Switzerland.
Pseudocode | Yes | Procedure 1: Benchmark with expected quality at budget
Open Source Code | No | The paper does not contain an explicit statement about the release of its own source code or a link to a code repository for its methodology.
Open Datasets | Yes | The architectures and datasets we experiment with are given in Table 3. ... FMNIST, CIFAR10/100, MNIST, SVHN, IMDB and Tolstoi's War and Peace
Dataset Splits | Yes | We refer the reader to Schneider et al. (2019) for specific details of the architectures. ... Thus we stop training when the validation loss plateaus for more than 2 epochs or if the number of epochs exceeds the predetermined maximum number as set in DEEPOBS.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or any other software dependencies with their versions.
Experiment Setup | Yes | We use Random Search on a large range of admissible values on each task specified in DEEPOBS to obtain an initial set of results. We then retain the hyperparameters which resulted in performance within 20% of the best result obtained. For each of the hyperparameters in this set, we fit the distributions in the third column of Table 2 using maximum likelihood estimation. ... Thus we stop training when the validation loss plateaus for more than 2 epochs or if the number of epochs exceeds the predetermined maximum number as set in DEEPOBS.
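The "Pseudocode" row points to Procedure 1, which benchmarks optimizers by their expected quality after a given hyperparameter-tuning budget. Below is a minimal sketch of how that quantity can be estimated, assuming the expected best validation score over a budget of K random hyperparameter draws is computed from N completed trials with the closed-form estimator popularized by Dodge et al. (2019); the function name and the with-replacement assumption are illustrative, not taken from the paper.

```python
import numpy as np

def expected_best_at_budget(trial_scores, budget):
    """Expected best validation score after `budget` random hyperparameter
    draws, estimated from N completed random-search trials.
    Assumes higher scores are better and draws are i.i.d. with replacement."""
    v = np.sort(np.asarray(trial_scores, dtype=float))   # v_(1) <= ... <= v_(N)
    n = len(v)
    ranks = np.arange(1, n + 1)
    # P(best of `budget` draws equals the i-th smallest score)
    #   = (i/N)^budget - ((i-1)/N)^budget
    p_best = (ranks / n) ** budget - ((ranks - 1) / n) ** budget
    return float(np.sum(v * p_best))
```

Evaluating this for budgets 1 through N traces out a tuning curve per optimizer, which is the kind of budget-aware comparison the procedure's name suggests.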
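The "Experiment Setup" row describes two steps: calibrating the hyperparameter sampling distributions (random search, keep configurations within 20% of the best, fit the Table 2 distributions by maximum likelihood) and an early-stopping rule. The sketch below illustrates both under stated assumptions: the log-normal prior stands in for whatever distribution Table 2 actually prescribes for the learning rate, and the patience logic is one reading of "plateaus for more than 2 epochs".

```python
from scipy import stats

def fit_lr_prior(retained_lrs):
    """MLE fit of a log-normal sampling distribution to the learning rates whose
    trials landed within 20% of the best random-search result.  The log-normal
    choice is an assumption standing in for the paper's Table 2."""
    shape, loc, scale = stats.lognorm.fit(retained_lrs, floc=0)
    return stats.lognorm(shape, loc=loc, scale=scale)


def should_stop(val_losses, patience=2, max_epochs=100):
    """Early-stopping rule read off the quoted setup: stop once the validation
    loss has not improved for more than `patience` epochs, or once the DEEPOBS
    epoch budget is reached (`max_epochs=100` is a placeholder, not the
    benchmark's actual per-task value)."""
    epoch = len(val_losses)
    if epoch == 0:
        return False
    if epoch >= max_epochs:
        return True
    best_epoch = 1 + min(range(epoch), key=lambda i: val_losses[i])
    return (epoch - best_epoch) > patience
```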