Optimizer Benchmarking Needs to Account for Hyperparameter Tuning
Authors: Prabhu Teja Sivaprasad, Florian Mai, Thijs Vogels, Martin Jaggi, François Fleuret
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluating a variety of optimizers on an extensive set of standard datasets and architectures, our results indicate that Adam is the most practical solution, particularly in low-budget scenarios. |
| Researcher Affiliation | Academia | Idiap Research Institute, Switzerland; EPFL, Switzerland; University of Geneva, Switzerland. |
| Pseudocode | Yes | Procedure 1 Benchmark with expected quality at budget (see the estimator sketch after the table) |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its own source code or a link to a code repository for its methodology. |
| Open Datasets | Yes | The architectures and datasets we experiment with are given in Table 3. ... FMNIST, CIFAR10/100, MNIST, SVHN, IMDB and Tolstoi's War and Peace |
| Dataset Splits | Yes | We refer the reader to Schneider et al. (2019) for specific details of the architectures. ... Thus we stop training when the validation loss plateaus for more than 2 epochs or if the number of epochs exceeds the predetermined maximum number as set in DEEPOBS. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or any other software dependencies with their versions. |
| Experiment Setup | Yes | We use Random Search on a large range of admissible values on each task specified in DEEPOBS to obtain an initial set of results. We then retain the hyperparameters which resulted in performance within 20% of the best result obtained. For each of the hyperparameters in this set, we fit the distributions in the third column of Table 2 using maximum likelihood estimation. ... Thus we stop training when the validation loss plateaus for more than 2 epochs or if the number of epochs exceeds the predetermined maximum number as set in DEEPOBS. (see the calibration and early-stopping sketches after the table) |
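
The Pseudocode row above refers to the paper's Procedure 1, which scores an optimizer by its expected quality after a given hyperparameter-tuning budget. Below is a minimal sketch of one common way to compute such an estimate: the expected maximum of `budget` i.i.d. draws from the empirical distribution of random-search trial scores. The function name and the example numbers are illustrative; the paper's exact procedure may differ.

```python
import numpy as np

def expected_best_at_budget(scores, budget, higher_is_better=True):
    """Expected best score after `budget` random-search trials, estimated from
    a pool of observed trial scores via the order-statistics formula for the
    maximum of `budget` i.i.d. draws from the empirical distribution."""
    v = np.asarray(scores, dtype=float)
    if not higher_is_better:          # e.g. validation loss: best means minimum
        v = -v                        # so "best" is always the maximum
    v = np.sort(v)                    # ascending order
    n = len(v)
    i = np.arange(1, n + 1)
    # P(the maximum of `budget` draws equals the i-th smallest observed value)
    weights = (i / n) ** budget - ((i - 1) / n) ** budget
    expected = float(np.dot(weights, v))
    return expected if higher_is_better else -expected

# Illustrative trial accuracies (not from the paper): expected best vs. budget
trial_accuracies = [0.62, 0.71, 0.68, 0.74, 0.70, 0.73, 0.66, 0.72]
for k in (1, 4, 16):
    print(f"budget={k}: {expected_best_at_budget(trial_accuracies, k):.3f}")
```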
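
The Experiment Setup row describes calibrating hyperparameter priors: run random search over a wide admissible range, retain the configurations within 20% of the best result, and fit the distributions listed in the paper's Table 2 to the retained values by maximum likelihood. The sketch below illustrates that calibration step for a single hyperparameter (the learning rate), assuming a log-normal prior and an accuracy-based retention rule; the distribution choice, retention criterion, and all names and numbers here are assumptions, not the authors' code.

```python
from scipy import stats

def calibrate_lr_prior(trials, tolerance=0.2):
    """Fit a prior for the learning rate from random-search results.

    `trials` is a list of (learning_rate, val_accuracy) pairs. Configurations
    whose accuracy is within `tolerance` (20%) of the best one are retained,
    and a log-normal is fit to the retained learning rates by maximum
    likelihood (scipy's `lognorm.fit`, location fixed at 0)."""
    best_acc = max(acc for _, acc in trials)
    retained = [lr for lr, acc in trials if acc >= (1.0 - tolerance) * best_acc]
    shape, loc, scale = stats.lognorm.fit(retained, floc=0)
    return stats.lognorm(shape, loc=loc, scale=scale)

# Hypothetical random-search results: (learning rate, validation accuracy)
trials = [(3e-4, 0.91), (1e-3, 0.92), (1e-2, 0.74), (1e-4, 0.88), (3e-3, 0.90)]
prior = calibrate_lr_prior(trials)
print(prior.rvs(size=3, random_state=0))   # sample learning rates for the tuned benchmark
```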
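
Finally, both the Dataset Splits and Experiment Setup rows quote the stopping rule: training halts once the validation loss has plateaued for more than 2 epochs, or once the predetermined DEEPOBS epoch cap is reached. A small sketch of such a criterion follows; the class name and the epoch cap are placeholders, since the paper takes the per-task maximum from DEEPOBS.

```python
class PlateauStopper:
    """Stop when validation loss has not improved for more than `patience`
    consecutive epochs, or when `max_epochs` is reached."""

    def __init__(self, patience=2, max_epochs=100, min_delta=0.0):
        self.patience = patience        # "plateaus for more than 2 epochs"
        self.max_epochs = max_epochs    # placeholder for the DEEPOBS per-task cap
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale_epochs = 0
        self.epoch = 0

    def should_stop(self, val_loss):
        self.epoch += 1
        if val_loss < self.best - self.min_delta:
            self.best, self.stale_epochs = val_loss, 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs > self.patience or self.epoch >= self.max_epochs


stopper = PlateauStopper(patience=2, max_epochs=50)        # 50 is illustrative
for val_loss in [0.90, 0.70, 0.65, 0.66, 0.66, 0.67, 0.68]:
    if stopper.should_stop(val_loss):
        break
```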