Efficient Benchmarking of Hyperparameter Optimizers via Surrogates
Authors: Katharina Eggensperger, Frank Hutter, Holger Hoos, Kevin Leyton-Brown
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated a wide range of regression techniques, both in terms of how well they predict the performance of new hyperparameter settings and in terms of the quality of surrogate benchmarks obtained. |
| Researcher Affiliation | Academia | Katharina Eggensperger and Frank Hutter, University of Freiburg ({eggenspk, fh}@cs.uni-freiburg.de); Holger H. Hoos and Kevin Leyton-Brown, University of British Columbia ({hoos, kevinlb}@cs.ubc.ca) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | Our final surrogate benchmarks are freely available online at www.automl.org/benchmarks.html. This page provides the benchmarks themselves (trained models and data), but not the source code used to train the regression models. |
| Open Datasets | Yes | We experimented with nine benchmarks from the hyperparameter optimization benchmark library HPOLIB (Eggensperger et al. 2013), including three low-dimensional and six high-dimensional hyperparameter spaces. The low-dimensional benchmarks were derived from a logistic regression (Snoek, Larochelle, and Adams 2012) with 4 hyperparameters on the MNIST dataset (LeCun et al. 1998)... |
| Dataset Splits | Yes | We experimented with nine benchmarks from the hyperparameter optimization benchmark library HPOLIB (Eggensperger et al. 2013), including three low-dimensional and six high-dimensional hyperparameter spaces. The low-dimensional benchmarks were derived from a logistic regression (Snoek, Larochelle, and Adams 2012) with 4 hyperparameters on the MNIST dataset (LeCun et al. 1998) (both with and without 5-fold cross validation)... |
| Hardware Specification | Yes | The evaluation of a single configuration of the logistic regression required roughly 1 minute on a single core of an Intel Xeon E5-2650 v2 CPU... To run efficiently, the HP-DBNET required a GPGPU; on a modern GeForce GTX 780 GPU, it took roughly 15 minutes to evaluate a single configuration. |
| Software Dependencies | Yes | As a baseline, we also experimented with k-nearest-neighbours (kNN), linear regression, ridge regression, and two SVM methods (all as implemented by scikit-learn, version 0.15.1 (Pedregosa et al. 2011)). A minimal sketch of these baseline regressors appears after this table. |
| Experiment Setup | Yes | We experimented with nine benchmarks from the hyperparameter optimization benchmark library HPOLIB (Eggensperger et al. 2013)... For each benchmark, we executed 10 runs each of SMAC, SPEARMINT, TPE and random search (using the Hyperopt implementation of both random search and TPE)... We used random search to optimize hyperparameters and considered 100 samples over the stated hyperparameters; we trained the model on 50% of the data, chose the best configuration based on its performance on the other 50%, and then trained on all data. A sketch of this 50/50 selection protocol also appears after the table. |
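
The baseline regressors quoted under Software Dependencies are standard scikit-learn estimators. The sketch below fits them on hypothetical (hyperparameter configuration, performance) pairs and scores them by RMSE. The data shapes, the 4-dimensional configuration space, and the choice of RBF and linear kernels for the "two SVM methods" are illustrative assumptions, not the authors' exact pipeline; the paper used scikit-learn 0.15.1, while this sketch uses the current API.

```python
# Sketch: fit the paper's baseline regressors as surrogates that map
# hyperparameter configurations to observed performance. All data here
# is synthetic and for illustration only.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 4))   # e.g. 4 logistic-regression hyperparameters
y = rng.uniform(size=500)        # observed validation error per configuration

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

baselines = {
    "kNN": KNeighborsRegressor(n_neighbors=5),
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "SVR-rbf": SVR(kernel="rbf"),        # the "two SVM methods" are assumed
    "SVR-linear": SVR(kernel="linear"),  # to be SVR with different kernels
}
for name, model in baselines.items():
    model.fit(X_train, y_train)
    rmse = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
    print(f"{name}: RMSE = {rmse:.3f}")
```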
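The Experiment Setup row describes a simple selection protocol: draw 100 hyperparameter settings at random, fit each on 50% of the data, pick the setting that performs best on the remaining 50%, then refit on all data. The sketch below follows that protocol for a hypothetical random-forest surrogate; the search space and data are invented for illustration and are not the paper's grids.

```python
# Sketch: the 100-sample random-search protocol with a 50/50 split for
# model selection, followed by a final refit on all data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(size=(400, 6))   # hypothetical configurations of a 6-d space
y = rng.uniform(size=400)        # hypothetical observed performances

X_fit, X_sel, y_fit, y_sel = train_test_split(X, y, test_size=0.5,
                                              random_state=0)

best_params, best_rmse = None, np.inf
for _ in range(100):             # 100 random samples, as in the paper
    params = {
        "n_estimators": int(rng.integers(10, 100)),
        "max_features": float(rng.uniform(0.1, 1.0)),
        "min_samples_split": int(rng.integers(2, 20)),
    }
    model = RandomForestRegressor(random_state=0, **params)
    model.fit(X_fit, y_fit)     # train on 50% of the data
    rmse = mean_squared_error(y_sel, model.predict(X_sel)) ** 0.5
    if rmse < best_rmse:        # select on the other 50%
        best_params, best_rmse = params, rmse

# Retrain the chosen configuration on all available data.
final_model = RandomForestRegressor(random_state=0, **best_params)
final_model.fit(X, y)
print("selected:", best_params, "held-out RMSE:", round(best_rmse, 3))
```

Selecting on a disjoint half of the data guards against choosing a configuration that merely overfits the training half; refitting on all data then recovers the sample efficiency lost to the split.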