Weighted Sampling for Combined Model Selection and Hyperparameter Tuning

Authors: Dimitrios Sarigiannis, Thomas Parnell, Haralampos Pozidis

AAAI 2020, pp. 5595–5603

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We select three popular model-free hyperparameter tuning algorithms and perform a large empirical study, using 67 datasets from OpenML (Vanschoren et al. 2013), with uniform sampling as well as the proposed scheme.
Researcher Affiliation | Industry | Dimitrios Sarigiannis, Thomas Parnell, Haralampos Pozidis. IBM Research, Säumerstrasse 4, 8803 Rüschlikon, Switzerland. saridimi@gmail.com, {tpa, hap}@zurich.ibm.com
Pseudocode | Yes | Algorithm 1 Successive Halving:
    Require: initial number of configurations n0, minimum resource r_min, scaling factor η, sampling distribution p(λ, α)
    1: s_max ← ⌊−log_η(r_min)⌋
    Ensure: n0 ≥ η^(s_max)
    2: T ← sample_configurations(n0, p(λ, α))
    3: for i ∈ {0, 1, ..., s_max} do
    4:     n_i ← ⌊n0 · η^(−i)⌋
    5:     r_i ← η^(−s_max + i)
    6:     L ← { eval_and_return_val_loss(θ, r_i) : θ ∈ T }
    7:     T ← top_k(T, L, ⌊n_i / η⌋)
    8: end for
    9: return the configuration with the smallest intermediate loss seen so far in T
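A minimal Python sketch of the successive-halving loop above, assuming hypothetical helpers `sample_configurations` (draws n configurations from p(λ, α)) and `eval_loss` (trains a configuration on a resource fraction and returns its validation loss); this illustrates Algorithm 1's control flow, not the authors' implementation.

```python
import math
import numpy as np

def successive_halving(sample_configurations, eval_loss, n0, r_min, eta):
    # sample_configurations(n): hypothetical helper drawing n configurations
    # from p(lambda, alpha); eval_loss(theta, r): hypothetical helper training
    # configuration theta with resource fraction r, returning validation loss.
    s_max = int(math.floor(-math.log(r_min, eta)))  # 1: s_max = floor(-log_eta(r_min))
    assert n0 >= eta ** s_max                       # Ensure: n0 >= eta^s_max
    T = sample_configurations(n0)                   # 2: initial configurations
    best_loss, best_theta = math.inf, None
    for i in range(s_max + 1):                      # 3: rungs i = 0, ..., s_max
        n_i = int(n0 * eta ** (-i))                 # 4: survivors at this rung
        r_i = eta ** (-s_max + i)                   # 5: resource at this rung
        L = [eval_loss(theta, r_i) for theta in T]  # 6: evaluate every survivor
        j = int(np.argmin(L))
        if L[j] < best_loss:                        # track the smallest
            best_loss, best_theta = L[j], T[j]      # intermediate loss seen
        keep = max(1, int(n_i / eta))               # 7: keep the top n_i / eta
        T = [T[idx] for idx in np.argsort(L)[:keep]]
    return best_theta                               # 9: best configuration seen
```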
Open Source Code | No | No explicit statement about providing open-source code for the methodology, or a link to a code repository, was found.
Open Datasets | Yes | All datasets were obtained from the OpenML platform (Vanschoren et al. 2013) and their characteristics are summarized in Table 2. A complete list of OpenML dataset IDs is provided in Appendix A, and the pre-processing scheme used is provided in Appendix B.
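As a sketch of pulling one of those datasets, scikit-learn's `fetch_openml` retrieves OpenML data by ID; the ID below is an arbitrary example, since the paper's Appendix A list is not reproduced on this page.

```python
from sklearn.datasets import fetch_openml

# Example only: OpenML dataset ID 31 ("credit-g") stands in for the IDs
# listed in the paper's Appendix A, which this page does not reproduce.
bunch = fetch_openml(data_id=31)
X, y = bunch.data, bunch.target
print(X.shape, y.shape)
```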
Dataset Splits | Yes | Firstly, we create a stratified train/test split of each dataset. We then perform 10 different stratified splits of the training set to create a collection of 10 different train/validation sets.
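A sketch of that splitting protocol with scikit-learn; the 80/20 proportions and the random seeds are assumptions, since the quoted excerpt does not state them.

```python
from sklearn.model_selection import StratifiedShuffleSplit, train_test_split

# X, y as loaded in the previous snippet. The 80/20 ratios and seeds below
# are assumptions; the excerpt only specifies that the splits are stratified.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# 10 different stratified train/validation splits of the training set.
splitter = StratifiedShuffleSplit(n_splits=10, test_size=0.2, random_state=0)
folds = list(splitter.split(X_train, y_train))  # 10 (train_idx, val_idx) pairs
```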
Hardware Specification | No | No hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments are mentioned in the paper.
Software Dependencies | Yes | For XGBoost we have used the xgboost v0.82 library and for the rest of the classifiers we have used scikit-learn v0.21.2. [...] All of the above methods are implemented in the R package SCMAMP (Calvo and Santafé Rodrigo 2016), which we will make extensive use of in the following section.
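As a sketch, the Python-side versions can be verified at runtime (the SCMAMP analysis is in R and is not covered here); pinning them this way is a suggestion on this page, not something the authors describe.

```python
# Check the dependency versions the paper reports before reproducing results.
import sklearn
import xgboost

assert sklearn.__version__ == "0.21.2", sklearn.__version__
assert xgboost.__version__ == "0.82", xgboost.__version__
```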
Experiment Setup | Yes | In our first comparison, we compare the three different SH schedules defined in Table 1 with a budget of 33, so that in the most explorative schedule n0 = 99 configurations are evaluated in the first rung. For each schedule, we evaluate SH with uniform model sampling and also with the weighted model sampling defined in equation (5). The hyperparameters for each model are sampled uniformly from a fixed range in both cases (possibly with some logarithmic transformations).
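A sketch contrasting the two sampling modes; the model families, weight values, and hyperparameter ranges below are placeholders, since the paper's equation (5) and Table 1 are not reproduced in this excerpt.

```python
import numpy as np

rng = np.random.RandomState(0)

# Placeholder model families. Uniform sampling uses equal probabilities;
# the paper's weighted scheme (its equation (5), not reproduced here)
# would supply non-uniform values such as the illustrative ones below.
MODELS = ["xgboost", "random_forest", "logistic_regression"]
UNIFORM = np.full(len(MODELS), 1.0 / len(MODELS))
WEIGHTED = np.array([0.5, 0.3, 0.2])  # illustrative weights only

def sample_configuration(model_weights):
    model = rng.choice(MODELS, p=model_weights)
    # Hyperparameters drawn uniformly from fixed (assumed) ranges, with a
    # logarithmic transformation for scale-sensitive parameters.
    if model == "xgboost":
        params = {"max_depth": rng.randint(1, 11),
                  "learning_rate": 10 ** rng.uniform(-3, 0)}  # log-uniform
    elif model == "random_forest":
        params = {"n_estimators": rng.randint(10, 501),
                  "max_features": rng.uniform(0.1, 1.0)}
    else:
        params = {"C": 10 ** rng.uniform(-4, 4)}              # log-uniform
    return model, params

# Most explorative schedule in the excerpt: n0 = 99 initial configurations.
configs = [sample_configuration(WEIGHTED) for _ in range(99)]
```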