ECONOMIC HYPERPARAMETER OPTIMIZATION WITH BLENDED SEARCH STRATEGY

Authors: Chi Wang, Qingyun Wu, Silu Huang, Amin Saied

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive empirical evaluation on the AutoML Benchmark (Gijsbers et al., 2019) validates the robust performance of our method on a wide variety of datasets. BlendSearch is now publicly available in an open-source AutoML library.
Researcher Affiliation | Industry | Chi Wang, Qingyun Wu, Silu Huang, Amin Saied; Microsoft Corporation, Redmond, WA 98052, USA; {wang.chi,qingyun.wu,silu.huang,amin.saied}@microsoft.com
Pseudocode | Yes | The overall design of our framework is presented in Figure 2 and Algorithm 1. ... Detailed pseudocode for several of the straightforward functions mentioned in our framework is provided in Appendix A, including CreateNewLSCondition, DeleteAndMergeLS, InitializeSearchThread and BookKeeping. (An illustrative thread-selection sketch follows the table.)
Open Source Code | Yes | BlendSearch is now publicly available in an open-source AutoML library: https://github.com/microsoft/FLAML (a minimal usage sketch follows the table).
Open Datasets | Yes | The AutoML benchmark consists of 39 tabular datasets that represent real-world data science classification problems (Gijsbers et al., 2019). ... The dataset used for training consists of 52K labeled examples, which we split 80/20 for training/validation.
Dataset Splits | Yes | The dataset used for training consists of 52K labeled examples, which we split 80/20 for training/validation. ... As each dataset has 10 cross-validation folds, all the results reported in this paper are averaged over the 10 folds. (A split example follows the table.)
Hardware Specification | Yes | The XGBoost and LightGBM experiments are performed in a server with Intel Xeon E5-2690 v4 2.6GHz, and 256GB RAM. ... We compare to ASHA and let it use 16 VMs with 4 NVIDIA Tesla V100 GPUs on each VM. ... All experiments for DeepTables are performed in a server with the same CPU, 110GB RAM, and one Tesla P100 GPU.
Software Dependencies | Yes | For BO, we use the implementation from Optuna 2.0.0 (https://optuna.readthedocs.io/en/stable/index.html) with default settings for the TPE sampler. (A configuration sketch follows the table.)
Experiment Setup | Yes | We tune a set of 9-dimensional hyperparameters (all numerical) in LightGBM and 11-dimensional hyperparameters (9 numerical and 2 categorical) in XGBoost. A detailed description of the search space can be found in Appendix B. ... The input budget B (in terms of CPU time) is set to be 4 hours for the 3 largest datasets among the 37 datasets, and 1 hour for the rest. (An illustrative search-space sketch follows the table.)
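
The Pseudocode row refers to Algorithm 1 and the appendix helper functions (CreateNewLSCondition, DeleteAndMergeLS, InitializeSearchThread, BookKeeping). The paper's exact procedure is not reproduced in this report; the sketch below is only a simplified, hypothetical illustration of a blended global/local thread-selection loop, with a stand-in priority rule and paraphrased helper names, not the authors' Algorithm 1.

```python
# Illustrative sketch (NOT the paper's Algorithm 1): keep one global search thread
# plus local search threads, pick the thread with the best stand-in priority score,
# and spawn a local thread around a promising global suggestion.
import random


class SearchThread:
    def __init__(self, suggest_fn):
        self.suggest = suggest_fn       # proposes the next config
        self.best_loss = float("inf")   # best observed loss so far
        self.cost = 0.0                 # accumulated evaluation cost

    def priority(self):
        # Stand-in priority: favor threads with low loss and low spent cost.
        return -(self.best_loss + 0.01 * self.cost)


def perturb(config):
    # Toy local move: jitter each numeric hyperparameter.
    return {k: v * random.uniform(0.8, 1.25) for k, v in config.items()}


def blended_search(global_thread, evaluate, budget):
    threads = [global_thread]
    spent = 0.0
    while spent < budget:
        thread = max(threads, key=lambda t: t.priority())
        config = thread.suggest()
        loss, cost = evaluate(config)
        spent += cost
        thread.cost += cost
        if loss < thread.best_loss:
            thread.best_loss = loss
            if thread is global_thread:
                # Hypothetical "create new local-search condition": start a local
                # thread around a promising config proposed by the global thread.
                threads.append(SearchThread(lambda c=config: perturb(c)))
    return min(t.best_loss for t in threads)


if __name__ == "__main__":
    global_thread = SearchThread(lambda: {"lr": random.uniform(1e-3, 1.0)})
    best = blended_search(global_thread, lambda c: (abs(c["lr"] - 0.1), 1.0), budget=100)
    print(best)
```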
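
For the Open Source Code row, the linked repository is the FLAML library. Below is a minimal usage sketch of its tune API on a toy objective; the objective and the exact parameter names (e.g., low_cost_partial_config, time_budget_s) are assumptions to verify against the installed library version.

```python
from flaml import tune


def evaluate_config(config):
    # Toy objective standing in for model training and validation.
    loss = (config["x"] - 3) ** 2 + config["y"]
    return {"loss": loss}


analysis = tune.run(
    evaluate_config,
    config={
        "x": tune.uniform(-10, 10),
        "y": tune.loguniform(1e-4, 1.0),
    },
    low_cost_partial_config={"y": 1e-4},  # cheap starting point for cost-aware search
    metric="loss",
    mode="min",
    time_budget_s=10,   # wall-clock budget in seconds
    num_samples=-1,     # keep sampling until the budget is exhausted
)
print(analysis.best_config)
```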
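
The 80/20 training/validation split from the Dataset Splits row can be reproduced with a standard utility; the synthetic data below is only a stand-in for the 52K labeled examples.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Stand-in for the 52K labeled examples described in the paper.
X, y = make_classification(n_samples=52_000, n_features=20, random_state=0)

# 80/20 train/validation split.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
print(X_train.shape, X_val.shape)
```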
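
The Optuna baseline in the Software Dependencies row uses the TPE sampler with default settings; the snippet below shows that configuration on an illustrative objective (the paper's actual objective is the tuned model's validation loss).

```python
import optuna


def objective(trial):
    # Illustrative objective; replace with model training and validation scoring.
    x = trial.suggest_float("x", -10.0, 10.0)
    return (x - 2.0) ** 2


# TPE sampler with its default settings, as used for the BO baseline.
study = optuna.create_study(direction="minimize", sampler=optuna.samplers.TPESampler())
study.optimize(objective, n_trials=50)
print(study.best_params)
```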
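
The Experiment Setup row mentions a 9-dimensional LightGBM search space detailed in the paper's Appendix B. The dictionary below is an illustrative 9-dimensional space written with flaml.tune domains; the hyperparameter names and ranges are assumptions, not the appendix's exact values.

```python
from flaml import tune

# Illustrative 9-dimensional LightGBM-style search space (not Appendix B verbatim);
# usable as the `config` argument of tune.run shown above.
lgbm_search_space = {
    "n_estimators": tune.lograndint(4, 32768),
    "num_leaves": tune.lograndint(4, 32768),
    "min_child_samples": tune.lograndint(2, 128),
    "learning_rate": tune.loguniform(1e-3, 1.0),
    "subsample": tune.uniform(0.1, 1.0),
    "colsample_bytree": tune.uniform(0.01, 1.0),
    "reg_alpha": tune.loguniform(1e-10, 1.0),
    "reg_lambda": tune.loguniform(1e-10, 1.0),
    "log_max_bin": tune.lograndint(3, 11),
}
```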