Hyperparameter optimization: a spectral approach
Authors: Elad Hazan, Adam Klivans, Yang Yuan
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments for training deep neural networks on Cifar-10 show that compared to state-of-the-art tools (e.g., Hyperband and Spearmint), our algorithm finds significantly improved solutions, in some cases better than what is attainable by hand-tuning. In terms of overall running time (i.e., time required to sample various settings of hyperparameters plus additional computation time), we are at least an order of magnitude faster than Hyperband and Bayesian Optimization. We also outperform Random Search. |
| Researcher Affiliation | Collaboration | Elad Hazan, Princeton University and Google Brain (ehazan@cs.princeton.edu); Adam Klivans, Department of Computer Science, University of Texas at Austin (klivans@cs.utexas.edu); Yang Yuan, Department of Computer Science, Cornell University (yangyuan@cs.cornell.edu) |
| Pseudocode | Yes | Algorithm 1 Harmonica-1; Procedure 2 Polynomial Sparse Recovery (PSR); Algorithm 3 Harmonica-q. (A hedged sketch of the sparse-recovery step appears after this table.) |
| Open Source Code | Yes | A python implementation of Harmonica can be found at https://github.com/callowbird/Harmonica |
| Open Datasets | Yes | Our first experiment is over training residual network on Cifar-10 dataset [9]. ... [9] https://github.com/facebook/fb.resnet.torch |
| Dataset Splits | No | The paper mentions training and test phases, but does not explicitly describe a separate validation split or how it was used in terms of percentages, counts, or methodology. It refers to "training epochs" and "test error" but not a validation set. |
| Hardware Specification | No | The paper mentions "GPU Day" as a unit of measurement for running time and states "6.1 GPU days" and "20 GPUs running in parallel." However, it does not specify the model or type of GPU, CPU, or any other hardware component used for the experiments. |
| Software Dependencies | No | The paper mentions several software tools and libraries used or compared against, such as "Spearmint (Snoek et al., 2012)", "Hyperband", "SH", "Random Search", "Lasso (Tibshirani, 1996)", and provides GitHub links for "Harmonica" and "fb.resnet.torch". However, it does not specify version numbers for any of these software dependencies. |
| Experiment Setup | Yes | Our first experiment is over training residual network on Cifar-10 dataset [9]. We included 39 binary hyperparameters, including initialization, optimization method, learning rate schedule, momentum rate, etc. Table 1 (Section C.1) details the hyperparameters considered. ... More specifically, during the feature selection stages, we run Harmonica for tuning an 8 layer neural network with 30 training epochs. ... as our base algorithm on the big 56 layer neural network for training the whole 160 epochs. (A hedged sketch of this staged setup appears after the table.) |
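
The Pseudocode row names the paper's three procedures: Harmonica-1, Polynomial Sparse Recovery (PSR), and Harmonica-q. The sketch below illustrates the core PSR idea as described in the paper: fit a sparse, low-degree polynomial in a {-1, +1} encoding of the binary hyperparameters via Lasso, then read off the variables appearing in the largest monomials. The function and parameter names here (`monomial_features`, `psr`, `degree`, `n_monomials`, `alpha`) are illustrative assumptions, not the API of the authors' Harmonica repository.

```python
# Minimal sketch of Polynomial Sparse Recovery (PSR) via Lasso over the
# low-degree monomial (parity) basis of {-1, +1} hyperparameter encodings.
# Names and defaults are assumptions for illustration only.

from itertools import combinations

import numpy as np
from sklearn.linear_model import Lasso


def monomial_features(X, degree):
    """Expand +/-1 configurations X (T x n) into all monomials of degree <= `degree`.

    Returns the feature matrix and the list of index tuples defining each monomial.
    """
    n = X.shape[1]
    index_sets = [s for d in range(1, degree + 1) for s in combinations(range(n), d)]
    features = np.column_stack([np.prod(X[:, list(s)], axis=1) for s in index_sets])
    return features, index_sets


def psr(X, y, degree=2, n_monomials=5, alpha=0.1):
    """Sparse recovery of a low-degree polynomial approximation of y = f(X).

    Returns the top monomials (as index tuples) and the variables they involve.
    """
    features, index_sets = monomial_features(X, degree)
    model = Lasso(alpha=alpha).fit(features, y)
    top = np.argsort(-np.abs(model.coef_))[:n_monomials]
    top_monomials = [index_sets[i] for i in top]
    important_vars = sorted({v for s in top_monomials for v in s})
    return top_monomials, important_vars


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, T = 20, 300
    X = rng.choice([-1.0, 1.0], size=(T, n))
    # Toy objective that depends on only a few variables, plus noise.
    y = 2.0 * X[:, 3] - 1.5 * X[:, 3] * X[:, 7] + 0.1 * rng.standard_normal(T)
    print(psr(X, y))
```

On the toy objective above, the recovered monomials should involve variables 3 and 7, which is the mechanism Harmonica uses to decide which hyperparameters to fix before the next stage.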
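
The Experiment Setup row describes a staged search: feature-selection stages are run on a cheap proxy (an 8-layer network trained for 30 epochs), and the base algorithm is then applied to the full 56-layer network trained for 160 epochs on the reduced search space. The sketch below, which assumes the `psr` helper from the previous sketch is in scope, shows one way such a staged loop could be organized; `staged_search`, `cheap_eval`, `expensive_eval`, and the stage/budget parameters are hypothetical names, and plain random search stands in here for whatever base algorithm is actually used.

```python
# Sketch of a staged search: cheap proxy evaluations drive a few
# feature-selection stages, then a base algorithm (random search here)
# runs on the reduced space with the expensive full-size evaluation.
# Assumes `psr` from the previous sketch; all names are illustrative.

import numpy as np


def staged_search(cheap_eval, expensive_eval, n_vars, n_stages=2,
                  samples_per_stage=100, budget=20, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    fixed = {}  # hyperparameter index -> value in {-1, +1} decided so far

    def fill(x_free, free_idx):
        """Assemble a full configuration from fixed values and free coordinates."""
        x = np.empty(n_vars)
        for i, v in fixed.items():
            x[i] = v
        x[free_idx] = x_free
        return x

    free_idx = list(range(n_vars))
    for _ in range(n_stages):
        # Sample configurations of the still-free variables and score them cheaply.
        X = rng.choice([-1.0, 1.0], size=(samples_per_stage, len(free_idx)))
        y = np.array([cheap_eval(fill(x, free_idx)) for x in X])
        # Sparse recovery picks out the influential free variables.
        _, important = psr(X, y, degree=2, n_monomials=5)
        # Fix each important variable to its value in the best-scoring sample.
        best = X[np.argmin(y)]
        for j in important:
            fixed[free_idx[j]] = best[j]
        free_idx = [i for i in range(n_vars) if i not in fixed]

    # Base algorithm on the reduced space, using the expensive evaluation.
    best_x, best_y = None, np.inf
    for _ in range(budget):
        x = fill(rng.choice([-1.0, 1.0], size=len(free_idx)), free_idx)
        val = expensive_eval(x)
        if val < best_y:
            best_x, best_y = x, val
    return best_x, best_y
```

In the paper's setup, `cheap_eval` would correspond to training the small 8-layer network for 30 epochs and `expensive_eval` to training the 56-layer network for 160 epochs, which is what makes the feature-selection stages inexpensive relative to the final search.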