Optimizer Benchmarking Needs to Account for Hyperparameter Tuning
Authors: Prabhu Teja Sivaprasad, Florian Mai, Thijs Vogels, Martin Jaggi, François Fleuret
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluating a variety of optimizers on an extensive set of standard datasets and architectures, our results indicate that Adam is the most practical solution, particularly in low-budget scenarios. |
| Researcher Affiliation | Academia | Idiap Research Institute, Switzerland; EPFL, Switzerland; University of Geneva, Switzerland. |
| Pseudocode | Yes | Procedure 1 Benchmark with expected quality at budget (see the estimator sketch after the table) |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its own source code or a link to a code repository for its methodology. |
| Open Datasets | Yes | The architectures and datasets we experiment with are given in Table 3. ... FMNIST, CIFAR10/100, MNIST, SVHN, IMDB and Tolstoi's War and Peace |
| Dataset Splits | Yes | We refer the reader to Schneider et al. (2019) for specific details of the architectures. ... Thus we stop training when the validation loss plateaus for more than 2 epochs or if the number of epochs exceeds the predetermined maximum number as set in DEEPOBS. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or any other software dependencies with their versions. |
| Experiment Setup | Yes | We use Random Search on a large range of admissible values on each task specified in DEEPOBS to obtain an initial set of results. We then retain the hyperparameters which resulted in performance within 20% of the best result obtained. For each of the hyperparameters in this set, we fit the distributions in the third column of Table 2 using maximum likelihood estimation. ... Thus we stop training when the validation loss plateaus for more than 2 epochs or if the number of epochs exceeds the predetermined maximum number as set in DEEPOBS. (see the calibration and early-stopping sketches after the table) |
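
The Pseudocode row above refers to the paper's Procedure 1, which scores an optimizer by its expected quality after a given hyperparameter-tuning budget. Below is a minimal sketch of one common way to compute such an estimate: the expected maximum of `budget` i.i.d. draws from the empirical distribution of random-search trial scores. The function name and the example numbers are illustrative; the paper's exact procedure may differ.

```python
import numpy as np

def expected_best_at_budget(scores, budget, higher_is_better=True):
    """Expected best score after `budget` random-search trials, estimated from
    a pool of observed trial scores via the order-statistics formula for the
    maximum of `budget` i.i.d. draws from the empirical distribution."""
    v = np.asarray(scores, dtype=float)
    if not higher_is_better:          # e.g. validation loss: best means minimum
        v = -v                        # so "best" is always the maximum
    v = np.sort(v)                    # ascending order
    n = len(v)
    i = np.arange(1, n + 1)
    # P(the maximum of `budget` draws equals the i-th smallest observed value)
    weights = (i / n) ** budget - ((i - 1) / n) ** budget
    expected = float(np.dot(weights, v))
    return expected if higher_is_better else -expected

# Illustrative trial accuracies (not from the paper): expected best vs. budget
trial_accuracies = [0.62, 0.71, 0.68, 0.74, 0.70, 0.73, 0.66, 0.72]
for k in (1, 4, 16):
    print(f"budget={k}: {expected_best_at_budget(trial_accuracies, k):.3f}")
```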
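
The Experiment Setup row describes calibrating hyperparameter priors: run random search over a wide admissible range, retain the configurations within 20% of the best result, and fit the distributions listed in the paper's Table 2 to the retained values by maximum likelihood. The sketch below illustrates that calibration step for a single hyperparameter (the learning rate), assuming a log-normal prior and an accuracy-based retention rule; the distribution choice, retention criterion, and all names and numbers here are assumptions, not the authors' code.

```python
from scipy import stats

def calibrate_lr_prior(trials, tolerance=0.2):
    """Fit a prior for the learning rate from random-search results.

    `trials` is a list of (learning_rate, val_accuracy) pairs. Configurations
    whose accuracy is within `tolerance` (20%) of the best one are retained,
    and a log-normal is fit to the retained learning rates by maximum
    likelihood (scipy's `lognorm.fit`, location fixed at 0)."""
    best_acc = max(acc for _, acc in trials)
    retained = [lr for lr, acc in trials if acc >= (1.0 - tolerance) * best_acc]
    shape, loc, scale = stats.lognorm.fit(retained, floc=0)
    return stats.lognorm(shape, loc=loc, scale=scale)

# Hypothetical random-search results: (learning rate, validation accuracy)
trials = [(3e-4, 0.91), (1e-3, 0.92), (1e-2, 0.74), (1e-4, 0.88), (3e-3, 0.90)]
prior = calibrate_lr_prior(trials)
print(prior.rvs(size=3, random_state=0))   # sample learning rates for the tuned benchmark
```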
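
Finally, both the Dataset Splits and Experiment Setup rows quote the stopping rule: training halts once the validation loss has plateaued for more than 2 epochs, or once the predetermined DEEPOBS epoch cap is reached. A small sketch of such a criterion follows; the class name and the epoch cap are placeholders, since the paper takes the per-task maximum from DEEPOBS.

```python
class PlateauStopper:
    """Stop when validation loss has not improved for more than `patience`
    consecutive epochs, or when `max_epochs` is reached."""

    def __init__(self, patience=2, max_epochs=100, min_delta=0.0):
        self.patience = patience        # "plateaus for more than 2 epochs"
        self.max_epochs = max_epochs    # placeholder for the DEEPOBS per-task cap
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale_epochs = 0
        self.epoch = 0

    def should_stop(self, val_loss):
        self.epoch += 1
        if val_loss < self.best - self.min_delta:
            self.best, self.stale_epochs = val_loss, 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs > self.patience or self.epoch >= self.max_epochs


stopper = PlateauStopper(patience=2, max_epochs=50)        # 50 is illustrative
for val_loss in [0.90, 0.70, 0.65, 0.66, 0.66, 0.67, 0.68]:
    if stopper.should_stop(val_loss):
        break
```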