Efficient and Robust Automated Machine Learning

Authors: Matthias Feurer, Aaron Klein, Katharina Eggensperger, Jost Tobias Springenberg, Manuel Blum, Frank Hutter

NeurIPS 2015

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our system won the first phase of the ongoing ChaLearn AutoML challenge, and our comprehensive analysis on over 100 diverse datasets shows that it substantially outperforms the previous state of the art in AutoML. We also demonstrate the performance gains due to each of our contributions and derive insights into the effectiveness of the individual components of AUTO-SKLEARN. |
| Researcher Affiliation | Academia | Matthias Feurer, Aaron Klein, Katharina Eggensperger, Jost Tobias Springenberg, Manuel Blum, Frank Hutter. Department of Computer Science, University of Freiburg, Germany. {feurerm,kleinaa,eggenspk,springj,mblum,fh}@cs.uni-freiburg.de |
| Pseudocode | Yes | Procedure 1 in the supplementary material describes it in detail. |
| Open Source Code | Yes | The source code of AUTO-SKLEARN is available under an open source license at https://github.com/automl/auto-sklearn. |
| Open Datasets | Yes | In an offline phase, for each machine learning dataset in a dataset repository (in our case 140 datasets from the OpenML [18] repository)... |
| Dataset Splits | Yes | Further, let D_train = {(x_1, y_1), ..., (x_n, y_n)} be a training set which is split into K cross-validation folds {D_valid^(1), ..., D_valid^(K)} and {D_train^(1), ..., D_train^(K)} such that D_train^(i) = D_train \ D_valid^(i) for i = 1, ..., K. |
| Hardware Specification | No | The paper mentions 'CPU and/or wallclock time' for the computational budget and '10.7 CPU years' of total experiment time, but it does not specify any particular CPU or GPU models, memory, or other hardware components used. |
| Software Dependencies | No | The paper mentions software frameworks such as 'scikit-learn [7]', 'WEKA [8]', 'SMAC [9]', and 'OpenML [18]', but does not provide the specific version numbers of these dependencies that reproducibility would require. |
| Experiment Setup | Yes | To study their performance under rigid time constraints, and also due to computational resource constraints, we limited the CPU time for each run to 1 hour; we also limited the runtime for a single model to a tenth of this (6 minutes). |
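The Dataset Splits row quotes the paper's K-fold definition: each training portion D_train^(i) is the complement of the corresponding validation fold D_valid^(i), and the K validation folds partition the training set. A minimal plain-Python sketch of that partition (assigning fold members by index stride is an illustrative choice of mine, not something the paper specifies):

```python
def kfold_indices(n, k):
    """Partition indices 0..n-1 into k disjoint validation folds and
    return (train, valid) index pairs, one per fold."""
    # Illustrative fold assignment: index i goes to fold i mod k.
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        valid = set(folds[i])
        # D_train^(i) = D_train \ D_valid^(i): everything not in fold i.
        train = [j for j in range(n) if j not in valid]
        splits.append((train, sorted(valid)))
    return splits
```

Each pair satisfies the paper's definition: train and valid are disjoint, and together they cover the full training set.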
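The Experiment Setup row fixes two budgets: 1 hour of CPU time per run, with any single model capped at a tenth of that (6 minutes). A sketch of that budget rule, with the constants taken from the quoted text and the helper names being my own:

```python
TOTAL_BUDGET_S = 3600                # 1 hour of CPU time per run (from the paper)
PER_MODEL_S = TOTAL_BUDGET_S // 10   # a single model gets a tenth: 360 s = 6 min

def remaining_budget(elapsed_s):
    """CPU seconds left in the overall 1-hour budget."""
    return max(0, TOTAL_BUDGET_S - elapsed_s)

def model_time_limit(elapsed_s):
    """Time allowed for the next model: never more than the per-model cap,
    and never more than what remains of the overall budget."""
    return min(PER_MODEL_S, remaining_budget(elapsed_s))
```

For reference, the released auto-sklearn package exposes, to my understanding, analogous knobs on `AutoSklearnClassifier` (`time_left_for_this_task` and `per_run_time_limit`), which would be set to 3600 and 360 to mirror this setup.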