reproducibilityindex.ai

Automated Machine Learning with Monte-Carlo Tree Search

Authors: Herilalaina Rakotoarison, Marc Schoenauer, Michèle Sebag

IJCAI 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive empirical studies are conducted to independently assess and compare: i) the optimization processes based on Bayesian optimization or MCTS; ii) its warm-start initialization; iii) the ensembling of the solutions gathered along the search. MOSAIC is assessed on the OpenML 100 benchmark and the Scikit-learn portfolio, with statistically significant gains over AUTO-SKLEARN, winner of former international AutoML challenges.
Researcher Affiliation	Academia	Herilalaina Rakotoarison, Marc Schoenauer and Michèle Sebag, TAU, LRI-CNRS INRIA Université Paris-Saclay, France
Pseudocode	Yes	Algorithm 1 MOSAIC Vanilla
Open Source Code	Yes	MOSAIC is publicly available under an open source license at https://github.com/herilalaina/mosaic_ml.
Open Datasets	Yes	The compared AutoML systems are assessed on the OpenML repository [Vanschoren et al., 2013], including 100 classification problems.
Dataset Splits	Yes	For all systems, every considered x configuration is launched to learn a model from 70% of the training set with a cut-off time of 300 seconds, and performance F(x) is set to the model accuracy on the remaining 30%.
Hardware Specification	Yes	Computational times are measured on an AMD Athlon 64 X2, 5GB RAM.
Software Dependencies	No	The paper mentions using a "scikit-learn portfolio" and comparing against other systems like AUTO-SKLEARN and TPOT, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	Yes	The overall computational budget is set to 1 hour for each dataset. ...MOSAIC involves 2 hyper-hyper-parameters...: the number n_s = 100... C_ucb = 1.3... PW = 0.6. Shared hyper-hyperparameters include: number n_r of uniformly sampled configurations and variance ε = 0.2 for the local search in the Playout phase (Section 3.3). ...every considered x configuration is launched to learn a model from 70% of the training set with a cut-off time of 300 seconds, and performance F(x) is set to the model accuracy on the remaining 30%.
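
The evaluation protocol quoted in the Dataset Splits and Experiment Setup rows (fit on 70% of the training set, score F(x) as accuracy on the remaining 30%, 300-second cut-off) can be sketched roughly as follows. This is an illustrative sketch and not the MOSAIC code: `holdout_split`, `MajorityClassifier`, and `evaluate_configuration` are hypothetical names, the majority-class model stands in for a real pipeline, and the cut-off is only checked after fitting, whereas a real system would interrupt the run. Scoring a timed-out configuration as 0.0 is also an assumption, since the quoted text does not say how timeouts are handled.

```python
import random
import time
from collections import Counter

TRAIN_FRACTION = 0.7   # from the table: 70% of the training set is used for fitting
CUTOFF_SECONDS = 300   # from the table: per-configuration cut-off time


def holdout_split(rows, labels, train_fraction=TRAIN_FRACTION, seed=0):
    """Shuffle indices and split the data into a fit part and a validation part."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(indices)))
    fit_idx, val_idx = indices[:cut], indices[cut:]
    fit = ([rows[i] for i in fit_idx], [labels[i] for i in fit_idx])
    val = ([rows[i] for i in val_idx], [labels[i] for i in val_idx])
    return fit, val


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most frequent label."""

    def fit(self, rows, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, rows):
        return [self.label_ for _ in rows]


def evaluate_configuration(make_model, rows, labels, cutoff=CUTOFF_SECONDS):
    """Return F(x): validation accuracy, or 0.0 if fitting exceeded the cut-off."""
    (fit_x, fit_y), (val_x, val_y) = holdout_split(rows, labels)
    start = time.monotonic()
    model = make_model().fit(fit_x, fit_y)
    if time.monotonic() - start > cutoff:
        return 0.0  # assumption: the source does not specify the timeout score
    predictions = model.predict(val_x)
    correct = sum(p == y for p, y in zip(predictions, val_y))
    return correct / len(val_y)
```

Calling `evaluate_configuration(MajorityClassifier, rows, labels)` returns a single accuracy value in [0, 1]; in a search loop, `make_model` would be replaced by a factory that builds the pipeline for the configuration x under evaluation.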