UQ-Guided Hyperparameter Optimization for Iterative Learners

Authors: Jiesong Liu, Feng Zhang, Jiawei Guan, Xipeng Shen

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on two widely used HPO benchmarks, NAS-Bench-201 [9] and LCBench [42], show that the enhanced methods produce models that have 21-55% regret reduction over the models from the original methods at the same exploration cost. And those enhanced methods need only 30-75% of the time to produce models with accuracy comparable to those by the original HPO methods.
Researcher Affiliation | Academia | Jiesong Liu, Feng Zhang, Jiawei Guan, Xipeng Shen; Department of Computer Science, North Carolina State University; School of Information, Renmin University of China
Pseudocode | Yes | Algorithm 1: UQ-Guided Hyperparameter Optimization (SH+); Algorithm 2: Oracle Model for determining K candidates into the next round; Algorithm 3: Hyperband plus (HB+); Algorithm 4: Bayesian Optimization Hyperband plus (BOHB+); Algorithm 5: Sub-Sampling plus (SS+)
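Since the pseudocode itself is not reproduced here, the following is a minimal, hypothetical sketch of how a UQ oracle could plug into successive halving (the SH+ idea): instead of always promoting a fixed top fraction of candidates, an uncertainty-aware rule decides how many survive each round. The `toy_eval` objective, the `choose_k_by_uncertainty` rule, and the noise level are illustrative assumptions, not the paper's Algorithms 1 and 2.

```python
import math
import random

def toy_eval(config, epochs):
    """Stand-in for training a candidate for `epochs` epochs and returning
    validation accuracy; a real run would train a model or query a benchmark."""
    ceiling = 0.5 + 0.5 / (1.0 + config["lr_penalty"])      # config-dependent ceiling
    return ceiling * (1.0 - math.exp(-epochs / 10.0)) + random.gauss(0, 0.01)

def choose_k_by_uncertainty(scores, noise_std, k_default):
    """Hypothetical stand-in for the paper's UQ oracle: keep every candidate
    whose score is within ~2 noise standard deviations of the current best,
    but never fewer than plain successive halving would keep."""
    best = max(scores)
    k_uq = sum(1 for s in scores if s >= best - 2.0 * noise_std)
    return max(k_default, min(k_uq, len(scores)))

def successive_halving_plus(configs, min_epochs=1, eta=3, rounds=4):
    """Sketch of UQ-guided successive halving (SH+): identical to plain SH
    except that the per-round survivor count K comes from the UQ rule above."""
    survivors = list(configs)
    epochs = min_epochs
    for _ in range(rounds):
        scores = [toy_eval(c, epochs) for c in survivors]
        k_default = max(1, len(survivors) // eta)            # plain SH keeps top 1/eta
        k = choose_k_by_uncertainty(scores, noise_std=0.01, k_default=k_default)
        ranked = sorted(zip(scores, range(len(survivors))), reverse=True)
        survivors = [survivors[i] for _, i in ranked[:k]]
        epochs *= eta                                         # more budget per survivor
        if len(survivors) == 1:
            break
    return survivors[0]

if __name__ == "__main__":
    random.seed(0)
    pool = [{"lr_penalty": random.uniform(0.0, 2.0)} for _ in range(27)]
    print("selected config:", successive_halving_plus(pool))
```

The only change relative to plain successive halving is the line that computes `k`, which is where the paper's oracle model would be consulted.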
Open Source Code | Yes | Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We include them in the supplemental material.
Open Datasets | Yes | We evaluate the UQ-guided methods on two real-world benchmarks. NAS-Bench-201 [9] (CC-BY 4.0) encompasses three heavyweight neural architecture search (NAS) tasks on the CIFAR-10, CIFAR-100, and ImageNet-16-120 (CC-BY 4.0) datasets. In addition, we investigate the performance of optimizing traditional ML pipelines, hyperparameters, and neural architecture in LCBench [42]. For example, we optimized 7 parameters for the Fashion-MNIST dataset [7]...
Dataset Splits | Yes | Table 2 columns: Tasks, Datasets, Hyperparameters, Fidelity, # Training set, # Validation set, # Test set ... For LCBench: whenever possible, we use the given 33% test split and additionally use a fixed 33% of the training data as the validation split.
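As a rough illustration of the LCBench split convention quoted above (a 33% test split, plus a fixed 33% of the remaining training data held out for validation), here is a minimal sketch assuming scikit-learn's `train_test_split`; the `make_splits` helper and the fixed seed are illustrative assumptions, not the paper's code.

```python
from sklearn.model_selection import train_test_split

def make_splits(X, y, seed=42):
    """Hypothetical split: 33% of the data as test, then a fixed 33% of the
    remaining training data as validation (seed fixed for reproducibility)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.33, random_state=seed)
    X_train, X_val, y_train, y_val = train_test_split(
        X_train, y_train, test_size=0.33, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)
```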
Hardware Specification | Yes | Our experiments are conducted on a platform equipped with an Intel i9-9900K CPU and an NVIDIA GeForce RTX 2080 Ti GPU. The CPU has 8 cores, each of which can support 2 threads. The GPU has 4,352 cores of Turing architecture with a compute capability of 7.5. The GPU can achieve a maximum memory bandwidth of 616 GB/s, 0.4 tera floating-point operations per second (TFLOPS) on double precision, and 13 TFLOPS on single precision.
Software Dependencies | No | The paper does not explicitly state specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | In this context, one unit of budget equates to a single training epoch, and by default, the total HPO budget (B) allocated for each method is 4 hours. ... Search space for the Fashion-MNIST task: Batch size: [16, 512], log-scale; Learning rate: [1e-4, 1e-1], log-scale; Momentum: [0.1, 0.99]; Weight decay: [1e-5, 1e-1]; Number of layers: [1, 5]; Maximum number of units per layer: [64, 1024], log-scale; Dropout: [0.0, 1.0]
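Below is a minimal sketch of how the quoted search space could be encoded and sampled, e.g. to draw the initial candidate pool for an HPO method; the dictionary layout, parameter keys, and `sample_config` helper are illustrative assumptions rather than the paper's configuration code.

```python
import math
import random

# Search space quoted above for the Fashion-MNIST (LCBench) task.
# "log" ranges are sampled log-uniformly; "int" ranges are rounded to integers.
SEARCH_SPACE = {
    "batch_size":    {"range": (16, 512),    "log": True, "int": True},
    "learning_rate": {"range": (1e-4, 1e-1), "log": True},
    "momentum":      {"range": (0.1, 0.99)},
    "weight_decay":  {"range": (1e-5, 1e-1)},
    "num_layers":    {"range": (1, 5),       "int": True},
    "max_units":     {"range": (64, 1024),   "log": True, "int": True},
    "dropout":       {"range": (0.0, 1.0)},
}

def sample_config(space=SEARCH_SPACE, rng=random):
    """Draw one configuration uniformly (or log-uniformly) from the space."""
    config = {}
    for name, spec in space.items():
        lo, hi = spec["range"]
        if spec.get("log"):
            value = math.exp(rng.uniform(math.log(lo), math.log(hi)))
        else:
            value = rng.uniform(lo, hi)
        if spec.get("int"):
            value = int(round(value))
        config[name] = value
    return config

if __name__ == "__main__":
    random.seed(0)
    print(sample_config())
```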