Initializing Bayesian Hyperparameter Optimization via Meta-Learning
Authors: Matthias Feurer, Jost Tobias Springenberg, Frank Hutter
AAAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate our approach, we perform extensive experiments with two established SMBO frameworks (Spearmint and SMAC) with complementary strengths; optimizing two machine learning frameworks on 57 datasets. |
| Researcher Affiliation | Academia | Matthias Feurer and Jost Tobias Springenberg and Frank Hutter {feurerm,springj,fh}@cs.uni-freiburg.de, Computer Science Department, University of Freiburg, Georges-Köhler-Allee 52, 79110 Freiburg, Germany |
| Pseudocode | Yes | Algorithm 1: Generic Sequential Model-based Optimization. SMBO(f_D, T, Θ, θ_{1:t}) (a hedged sketch of this loop appears after the table) |
| Open Source Code | No | The paper mentions supplementary material for more results, but does not state that source code for the described methodology is publicly available. The text "for more results, please see the supplementary material: www.automl.org/aaai2015-mi-smbo-supplementary.pdf" refers to results, not code. |
| Open Datasets | Yes | We found the OpenML project (Vanschoren et al. 2013) to be the best source of datasets and used the 60 classification datasets it contained in April 2014. |
| Dataset Splits | Yes | We first shuffled each dataset and then split it in stratified fashion into 2/3 training and 1/3 test data. Then, we computed the validation performance for Bayesian optimization by ten-fold cross-validation on the training dataset. (This protocol is reconstructed in the second sketch after the table.) |
| Hardware Specification | No | The paper mentions that calculating the grid took up to three days per dataset "on a modern CPU" but provides no specific hardware details (e.g., CPU model, GPU, memory). |
| Software Dependencies | No | The paper mentions the use of "scikit-learn package (Pedregosa et al. 2011)" and "WEKA package (Hall et al. 2009)" but does not specify version numbers for these or any other software components. |
| Experiment Setup | Yes | To keep the computation bearable and the results interpretable, we only included three classification algorithms: an SVM with an RBF kernel, a linear SVM, and random forests. Since we expected noise and redundancies in the training data, we also allowed the optimization procedure to use Principal Component Analysis (PCA) for preprocessing; with the number of PCA components being conditional on PCA being applied. In total this led to 10 hyperparameters, as detailed in Table 2. (The third sketch after the table illustrates this conditional search space.) |
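
The pseudocode row refers to the paper's generic SMBO loop. Below is a minimal Python sketch of that loop, assuming a Gaussian-process surrogate and an expected-improvement acquisition function; these are illustrative choices (Spearmint uses GPs, SMAC uses random forests), and the function names and candidate-sampling scheme are hypothetical, not the authors' code.

```python
# Hedged sketch of the generic SMBO loop (Algorithm 1), not the paper's implementation.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(model, candidates, best_y):
    """EI acquisition for minimization: expected improvement over best_y."""
    mu, sigma = model.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)              # guard against zero variance
    z = (best_y - mu) / sigma
    return (best_y - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def smbo(f, T, theta_init, candidate_sampler, rng):
    """Evaluate the initial design theta_init (the part MI-SMBO seeds via
    meta-learning), then iterate: fit surrogate -> maximize acquisition ->
    evaluate, until the budget of T evaluations is spent."""
    X = [np.asarray(t) for t in theta_init]
    y = [f(t) for t in theta_init]
    for _ in range(T - len(theta_init)):
        model = GaussianProcessRegressor(normalize_y=True).fit(np.vstack(X), y)
        cand = candidate_sampler(rng)            # e.g. a random sample from Θ
        theta = cand[np.argmax(expected_improvement(model, cand, min(y)))]
        X.append(theta)
        y.append(f(theta))
    best = int(np.argmin(y))
    return X[best], y[best]

# Example: minimize a 1-d quadratic over Θ = [-5, 5].
best_theta, best_val = smbo(
    f=lambda t: float((t[0] - 2.0) ** 2),
    T=20,
    theta_init=[np.array([-4.0]), np.array([0.0]), np.array([4.0])],
    candidate_sampler=lambda r: r.uniform(-5, 5, size=(64, 1)),
    rng=np.random.default_rng(0),
)
```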
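
The Open Datasets and Dataset Splits rows describe where the data came from and how it was partitioned. The sketch below reconstructs that protocol with scikit-learn; the OpenML dataset (data_id=61, iris) and the RBF SVM are arbitrary placeholders, not the paper's exact setup.

```python
# Illustrative reconstruction of the split/validation protocol, not the authors' code.
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.svm import SVC

# Fetch an OpenML dataset (data_id 61 = iris, chosen arbitrarily for the example).
X, y = fetch_openml(data_id=61, return_X_y=True, as_frame=False)

# Shuffle, then split in stratified fashion into 2/3 training and 1/3 test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, stratify=y, shuffle=True, random_state=0)

# Validation performance for Bayesian optimization: ten-fold cross-validation
# on the training split only; the test split stays held out for final evaluation.
clf = SVC(kernel="rbf")
val_score = cross_val_score(clf, X_train, y_train, cv=10).mean()
print(f"10-fold CV accuracy on training split: {val_score:.3f}")
```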
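
Finally, the Experiment Setup row lists three classifiers plus a conditional PCA preprocessor. The sketch below shows one way to express such a conditional space, where the number of PCA components only exists when PCA is applied. The sampling helper, the 50% PCA probability, and the hyperparameter ranges are assumptions for illustration; the paper's exact ranges are given in its Table 2.

```python
# Minimal sketch of a conditional configuration space; helper and ranges are hypothetical.
import random
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC, LinearSVC

def sample_configuration(rng: random.Random):
    """Draw one configuration; n_components is only sampled if PCA is chosen."""
    steps = []
    if rng.random() < 0.5:                       # conditional: PCA on or off
        steps.append(PCA(n_components=rng.randint(2, 10)))
    clf_name = rng.choice(["rbf_svm", "linear_svm", "random_forest"])
    if clf_name == "rbf_svm":
        steps.append(SVC(kernel="rbf",
                         C=2 ** rng.uniform(-5, 15),      # log-scale range, assumed
                         gamma=2 ** rng.uniform(-15, 3)))
    elif clf_name == "linear_svm":
        steps.append(LinearSVC(C=2 ** rng.uniform(-5, 15)))
    else:
        steps.append(RandomForestClassifier(n_estimators=100))
    return make_pipeline(*steps)

pipeline = sample_configuration(random.Random(0))
```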