reproducibilityindex.ai

Supervising the Multi-Fidelity Race of Hyperparameter Configurations

Authors: Martin Wistuba, Arlind Kadra, Josif Grabocka

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the significant superiority of DyHPO against state-of-the-art hyperparameter optimization methods through large-scale experiments comprising 50 datasets (Tabular, Image, NLP) and diverse architectures (MLP, CNN/NAS, RNN).
Researcher Affiliation	Collaboration	Martin Wistuba, Amazon Web Services, Berlin, Germany (marwistu@amazon.com); Arlind Kadra, University of Freiburg, Freiburg, Germany (kadraa@cs.uni-freiburg.de); Josif Grabocka, University of Freiburg, Freiburg, Germany (grabocka@cs.uni-freiburg.de)
Pseudocode	Yes	Algorithm 1 DYHPO Algorithm
Open Source Code	Yes	Our implementation of DYHPO is publicly available. (Footnote 3: https://github.com/releaunifreiburg/DyHPO)
Open Datasets	Yes	LCBench: A learning curve benchmark [Zimmer et al., 2021]... TaskSet: A benchmark that features diverse tasks Metz et al. [2020]... NAS-Bench-201: A benchmark consisting of 15625 hyperparameter configurations representing different architectures on the CIFAR-10, CIFAR-100 and ImageNet datasets Dong and Yang [2020].
Dataset Splits	No	The paper describes the benchmarks used (LCBench, TaskSet, NAS-Bench-201) and their characteristics, but it does not explicitly state the training/validation/test splits used in its experiments, nor does it refer to specific predefined splits within those benchmarks.
Hardware Specification	Yes	We ran all of our experiments on an Amazon EC2 M5 Instance (m5.xlarge).
Software Dependencies	No	The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow).
Experiment Setup	Yes	For DYHPO, we use a constant learning rate of 0.1 for training the kernel parameters, and we train for 100 iterations per step. For all methods, we use a single learning rate of 0.001.
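
The settings quoted in the Experiment Setup and Hardware Specification rows can be collected in one place when attempting a reproduction. The following is a minimal sketch only: the class name `DyHPOSetup` and all field names are hypothetical and are not taken from the DyHPO repository, and the values are limited to those quoted in the table above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DyHPOSetup:
    """Settings quoted in the table above; all field names are hypothetical."""

    # "a constant learning rate of 0.1 for training the kernel parameters"
    kernel_learning_rate: float = 0.1
    # "we train for 100 iterations per step"
    kernel_iterations_per_step: int = 100
    # "For all methods, we use a single learning rate of 0.001"
    shared_learning_rate: float = 0.001
    # "an Amazon EC2 M5 Instance (m5.xlarge)"
    instance_type: str = "m5.xlarge"


setup = DyHPOSetup()
print(asdict(setup))
```

Freezing the dataclass keeps the recorded values from being changed accidentally during a run. Anything the table marks as unreported (dataset splits, software versions) is deliberately left out rather than guessed.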