reproducibilityindex.ai

ChaCha for Online AutoML

Authors: Qingyun Wu, Chi Wang, John Langford, Paul Mineiro, Marco Rossi

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirically, we show that ChaCha provides good performance across a wide array of datasets when optimizing over featurization and hyperparameter decisions. We test the ChaCha algorithm on a suite of large regression datasets from OpenML (Vanschoren et al., 2014) for two online AutoML tasks. Figure 1 shows a demonstrative result obtained by ChaCha for tuning feature interaction choices, eclipsing a widely used online learning algorithm. Further experimentation demonstrates ChaCha is consistently near-best amongst plausible alternatives.
Researcher Affiliation	Collaboration	Microsoft Research. Correspondence to: Qingyun Wu <qxw5138@psu.edu>, John Langford <jcl@microsoft.com>.
Pseudocode	Yes	Algorithm 1: ChaCha; Algorithm 2: Schedule(b, B, S); Algorithm 3: Choose(S)
Open Source Code	Yes	Our method is open-sourced in the AutoML library FLAML (https://github.com/microsoft/FLAML/tree/main/flaml/onlineml). Please find a demonstration of usage in this notebook: https://github.com/microsoft/FLAML/blob/main/notebook/flaml_autovw.ipynb
Open Datasets	Yes	We evaluate our method on a set of 40 large-scale regression datasets from OpenML (number of instances: 10K to 1M). All the datasets are publicly available on OpenML (https://www.openml.org/search?type=data).
Dataset Splits	No	The paper uses 'progressive validation loss' as an evaluation metric in an online learning setting, but it does not specify traditional train/validation/test dataset splits (e.g., percentages or sample counts) as is common in batch learning.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies	No	The paper mentions using 'Vowpal Wabbit' for evaluation but does not specify a version number. No other software dependencies with version numbers are listed.
Experiment Setup	Yes	We perform the main evaluation under the constraint that a maximum of 5 live learners are allowed, i.e., b = 5. We use the default configuration in VW as the initial configuration c_init: no feature interactions, and the learning rate is 0.5. We use the VW default learning algorithm (which uses a variant of online gradient descent) as the base learner. For all the experiments, we run each method 5 times with different random seeds.
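
The table refers to "progressive validation loss" and to a five-seed protocol without showing what either looks like in practice. The following is a minimal sketch in plain Python, not the paper's or Vowpal Wabbit's implementation. The normalized gradient step is a simplified stand-in for VW's default learner, and `synthetic_stream` and `run_with_seeds` are hypothetical helpers that replace the OpenML datasets with noiseless synthetic data. Only the learning rate of 0.5 and the count of five seeds come from the table above.

```python
import random


def progressive_validation_loss(stream, learning_rate=0.5):
    """Mean squared loss where each example is scored before it is learned from."""
    weights = {}
    total_loss = 0.0
    count = 0
    for features, label in stream:
        # 1. Predict with the current model (the example is still unseen).
        prediction = sum(weights.get(name, 0.0) * value
                         for name, value in features.items())
        total_loss += (prediction - label) ** 2
        count += 1
        # 2. Only then update. Normalized step: a simplification, not VW's rule.
        norm = sum(value * value for value in features.values())
        if norm > 0.0:
            step = learning_rate * (label - prediction) / norm
            for name, value in features.items():
                weights[name] = weights.get(name, 0.0) + step * value
    return total_loss / count if count else 0.0


def synthetic_stream(seed, n_examples=1000):
    """Hypothetical noiseless linear data standing in for an OpenML dataset."""
    rng = random.Random(seed)
    for _ in range(n_examples):
        x1, x2 = rng.random(), rng.random()
        yield {"x1": x1, "x2": x2, "bias": 1.0}, 3.0 * x1 - 2.0 * x2 + 1.0


def run_with_seeds(seeds=(0, 1, 2, 3, 4)):
    """Repeat the run once per seed, mirroring the five-seed protocol."""
    return [progressive_validation_loss(synthetic_stream(seed)) for seed in seeds]
```

Because every example is scored before the model updates on it, the whole stream serves as both training and evaluation data, which is why no separate train/validation/test split is reported. Calling `run_with_seeds()` returns one loss per seed, which can then be averaged or reported with its spread.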