Efficient and Robust Automated Machine Learning
Authors: Matthias Feurer, Aaron Klein, Katharina Eggensperger, Jost Springenberg, Manuel Blum, Frank Hutter
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our system won the first phase of the ongoing ChaLearn AutoML challenge, and our comprehensive analysis on over 100 diverse datasets shows that it substantially outperforms the previous state of the art in AutoML. We also demonstrate the performance gains due to each of our contributions and derive insights into the effectiveness of the individual components of AUTO-SKLEARN. |
| Researcher Affiliation | Academia | Matthias Feurer, Aaron Klein, Katharina Eggensperger, Jost Tobias Springenberg, Manuel Blum, Frank Hutter; Department of Computer Science, University of Freiburg, Germany; {feurerm,kleinaa,eggenspk,springj,mblum,fh}@cs.uni-freiburg.de |
| Pseudocode | Yes | Procedure 1 in the supplementary material describes it in detail. |
| Open Source Code | Yes | The source code of AUTO-SKLEARN is available under an open source license at https://github.com/automl/auto-sklearn. |
| Open Datasets | Yes | In an offline phase, for each machine learning dataset in a dataset repository (in our case 140 datasets from the OpenML [18] repository)... |
| Dataset Splits | Yes | Further, let $D_{\text{train}} = \{(x_1, y_1), \ldots, (x_n, y_n)\}$ be a training set which is split into $K$ cross-validation folds $\{D_{\text{valid}}^{(1)}, \ldots, D_{\text{valid}}^{(K)}\}$ and $\{D_{\text{train}}^{(1)}, \ldots, D_{\text{train}}^{(K)}\}$ such that $D_{\text{train}}^{(i)} = D_{\text{train}} \setminus D_{\text{valid}}^{(i)}$ for $i = 1, \ldots, K$. |
| Hardware Specification | No | The paper mentions 'CPU and/or wallclock time' for computational budget and '10.7 CPU years' for total experiment time, but it does not specify any particular CPU or GPU models, memory, or other hardware components used. |
| Software Dependencies | No | The paper mentions software frameworks like 'scikit-learn [7]', 'WEKA [8]', 'SMAC [9]', and 'OpenML [18]' but does not provide specific version numbers for these software dependencies as required for reproducibility. |
| Experiment Setup | Yes | To study their performance under rigid time constraints, and also due to computational resource constraints, we limited the CPU time for each run to 1 hour; we also limited the runtime for a single model to a tenth of this (6 minutes). |
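
The fold definition quoted under Dataset Splits can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function name `k_fold_splits`, the round-robin rule for assigning examples to validation folds, and the toy data are all assumptions made for illustration. The only property taken from the quoted text is that each training fold equals the full training set minus its validation fold.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def k_fold_splits(d_train: Sequence[T], k: int) -> List[Tuple[List[T], List[T]]]:
    """Return K (train_fold, valid_fold) pairs over d_train.

    Each train fold is d_train with the matching valid fold removed,
    mirroring the set-difference definition quoted in the table.
    """
    if not 2 <= k <= len(d_train):
        raise ValueError("k must satisfy 2 <= k <= len(d_train)")
    splits = []
    for i in range(k):
        # Assumed assignment rule (hypothetical): example j belongs to
        # validation fold j mod k. The paper does not specify a rule here.
        valid = [ex for j, ex in enumerate(d_train) if j % k == i]
        train = [ex for j, ex in enumerate(d_train) if j % k != i]
        splits.append((train, valid))
    return splits


# Toy training set of (x, y) pairs, purely illustrative.
d_train = [(x, x % 2) for x in range(10)]
splits = k_fold_splits(d_train, k=5)
```

Each element of `splits` is one fold pair. A real experiment would typically shuffle or stratify before assigning folds, which this sketch deliberately leaves out.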