Optimizing Hyperparameters with Conformal Quantile Regression
Authors: David Salinas, Jacek Golebiowski, Aaron Klein, Matthias Seeger, Cedric Archambeau
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run empirical evaluations on a large set of benchmarks, demonstrating that quantile regression surrogates achieve a more robust performance compared to state-of-the-art methods in the single-fidelity case |
| Researcher Affiliation | Industry | Amazon Web Services. Correspondence to: David Salinas <david.salinas.pro@gmail.com>. |
| Pseudocode | Yes | Algorithm 1 CQR candidate suggestion pseudo-code. |
| Open Source Code | Yes | The code to reproduce our results is available at https://github.com/geoalgo/syne-tune/tree/icml_conformal. |
| Open Datasets | Yes | Our experiments rely on 13 tasks coming from FCNet (Klein & Hutter, 2019), NAS201 (Dong & Yang, 2020) and LCBench (Zimmer et al., 2021) benchmarks as well as NAS301 (Siems et al., 2020) using the implementation provided in (Pfisterer et al., 2022). |
| Dataset Splits | Yes | D_train, D_val = split_train_val(D) (from Algorithm 1). All runs are repeated with 30 different random seeds (from Experiment Setup). In each case, we draw a random subset of size n to train the surrogate model and then evaluate the three metrics on remaining unseen examples. |
| Hardware Specification | Yes | We use the simulation backend provided by Syne Tune (Salinas et al., 2022) on an AWS m5.4xlarge machine to simulate methods, which allows us to account for both optimizer and blackbox runtimes. |
| Software Dependencies | No | We use gradient boosted trees (Friedman, 2001) for the quantile-regression models... BORE is evaluated with XGBoost as the classifier... No specific version numbers for these software components are provided. |
| Experiment Setup | Yes | All tuning experiments run asynchronously with 4 workers and are stopped when 200 · r_max results were observed, which corresponds to seeing 200 different configurations for single-fidelity methods, or when the wallclock time exceeded a fixed budget. All runs are repeated with 30 different random seeds. |
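
The Software Dependencies row notes that gradient boosted trees are used for the quantile-regression surrogates. Below is a minimal sketch of fitting such a surrogate, assuming scikit-learn's `GradientBoostingRegressor`; the paper does not state which implementation or quantile levels are used, so treat both as illustrative.

```python
# Minimal sketch: gradient-boosted quantile surrogates for an HPO objective.
# Assumes scikit-learn's GradientBoostingRegressor with the quantile loss;
# the paper only states that gradient boosted trees are used, so the exact
# implementation and quantile grid may differ.
from sklearn.ensemble import GradientBoostingRegressor

def fit_quantile_surrogates(X_train, y_train, quantiles=(0.1, 0.5, 0.9)):
    """Fit one gradient-boosted model per quantile level."""
    models = {}
    for q in quantiles:
        model = GradientBoostingRegressor(loss="quantile", alpha=q)
        model.fit(X_train, y_train)
        models[q] = model
    return models
```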
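
The Pseudocode and Dataset Splits rows reference Algorithm 1, which splits the observed data into D_train and D_val and conformalizes the quantile predictions on the held-out part. The sketch below follows the standard conformalized quantile regression correction of Romano et al. (2019); variable names and the miscoverage level `alpha` are illustrative and not taken from the authors' code.

```python
# Minimal sketch of the conformal step (Romano et al., 2019): shift the
# predicted interval [q_lo(x), q_hi(x)] by the finite-sample-corrected
# empirical quantile of the validation conformity scores.
import numpy as np

def conformalize(models, X_val, y_val, lo=0.1, hi=0.9, alpha=0.2):
    """Return the correction added to the upper and subtracted from the lower quantile."""
    q_lo = models[lo].predict(X_val)
    q_hi = models[hi].predict(X_val)
    # Conformity score: how far each validation point falls outside its interval.
    scores = np.maximum(q_lo - y_val, y_val - q_hi)
    n = len(y_val)
    # Finite-sample corrected quantile level, clipped to 1.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level)

def predict_interval(models, X, correction, lo=0.1, hi=0.9):
    """Conformalized prediction interval for new configurations X."""
    lower = models[lo].predict(X) - correction
    upper = models[hi].predict(X) + correction
    return lower, upper
```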
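
The Experiment Setup row describes asynchronous tuning with 4 workers, stopped after 200 · r_max observed results or a fixed wallclock budget. The experiments themselves run through Syne Tune's simulation backend; the stand-alone loop below, with hypothetical `suggest` and `evaluate` callables, only mirrors that stopping logic.

```python
# Illustrative asynchronous tuning loop: keep 4 workers busy, stop once the
# target number of results is reached or the wallclock budget is exhausted.
# `suggest` and `evaluate` are hypothetical stand-ins for the optimizer and
# the blackbox; the actual experiments use Syne Tune, not this loop.
import time
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def tune(suggest, evaluate, max_results=200, max_wallclock=3 * 3600, n_workers=4):
    start, results, pending = time.time(), [], set()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        while len(results) < max_results and time.time() - start < max_wallclock:
            # Keep all workers busy with fresh configurations.
            while len(pending) < n_workers:
                pending.add(pool.submit(evaluate, suggest(results)))
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            results.extend(f.result() for f in done)
    return results
```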