Hyperparameter Optimization Is Deceiving Us, and How to Stop It
Authors: A. Feder Cooper, Yucheng Lu, Jessica Forde, Christopher M. De Sa
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our framework enables us to prove EHPO methods that are guaranteed to be defended against deception, given bounded compute time budget t. We demonstrate our framework's utility by proving and empirically validating a defended variant of random search. Validating our defense empirically and selecting hyper-HPs. Any defense ultimately depends on the hyper-HPs it uses. |
| Researcher Affiliation | Academia | A. Feder Cooper Cornell University afc78@cornell.edu Yucheng Lu Cornell University yl2967@cornell.edu Jessica Zosa Forde Brown University jforde2@cs.brown.edu Christopher De Sa Cornell University cdesa@cs.cornell.edu |
| Pseudocode | Yes | Algorithm 1 Defense with Random Search |
| Open Source Code | Yes | All code can be found at https://github.com/pasta41/deception. |
| Open Datasets | Yes | We first reproduce Wilson et al. [72], in which the authors trained VGG16 with different optimizers on CIFAR-10 (Figure 1a). |
| Dataset Splits | No | The paper mentions 'usually split into train and validation sets' generally, and states 'The input dataset X can be split in various ways, as a function of the random seed r.' However, it does not provide specific percentages, counts, or explicit methods for the validation split used in its experiments. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments, such as GPU models, CPU types, or cloud computing instance specifications. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., 'Python 3.x,' 'PyTorch 1.x,' 'CUDA x.x') that would be necessary to replicate the experiments. |
| Experiment Setup | Yes | We change the hyper-HPs, shifting the distribution until Adam's performance starts to degrade, and use the resulting hyper-HPs ([10^10, 10^12]) to run our defense (Appendix). We now run a modified version of our defended EHPO in Definition 7, described in Algorithm 1, with K_R = 600 (200 logs for each optimizer). Using a budget of M = 10000 iterations, we subsample κ = 11 logs. (A hedged code sketch of this defended procedure appears after the table.) |
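
The experiment-setup row above describes the defended random-search procedure (Algorithm 1) at a high level: precompute a pool of training logs (200 per optimizer), then repeatedly subsample κ = 11 logs per optimizer and only draw a conclusion that survives resampling. The sketch below is a minimal illustration of that idea, not the authors' exact Algorithm 1; the names `TrialLog`, `best_val_acc`, and `defended_conclusion`, the abstain-on-any-disagreement rule, and the synthetic log values are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class TrialLog:
    """One precomputed random-search trial: optimizer name, sampled HPs, best validation accuracy."""
    optimizer: str
    hyperparams: dict
    val_acc: float


def best_val_acc(logs, optimizer, k, rng):
    """Subsample k logs for one optimizer and return the best validation accuracy among them."""
    pool = [log for log in logs if log.optimizer == optimizer]
    subsample = rng.sample(pool, k)
    return max(log.val_acc for log in subsample)


def defended_conclusion(logs, opt_a, opt_b, k=11, repeats=100, seed=0):
    """
    Hypothetical defended comparison: declare 'opt_a beats opt_b' only if the
    conclusion holds in every one of `repeats` independent subsamples of k logs.
    Otherwise abstain (return None), because a ranking that flips when the logs
    are redrawn is exactly the kind of conclusion that hyper-HP choices can fake.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        if best_val_acc(logs, opt_a, k, rng) <= best_val_acc(logs, opt_b, k, rng):
            return None  # conclusion is not robust to resampling; refuse to conclude
    return f"{opt_a} > {opt_b}"


if __name__ == "__main__":
    # Synthetic pool of 200 logs per optimizer (values are made up for the example).
    rng = random.Random(42)
    logs = [TrialLog("sgd", {"lr": 10 ** rng.uniform(-3, 0)}, rng.uniform(0.85, 0.93)) for _ in range(200)]
    logs += [TrialLog("adam", {"lr": 10 ** rng.uniform(-5, -2)}, rng.uniform(0.80, 0.91)) for _ in range(200)]
    print(defended_conclusion(logs, "sgd", "adam", k=11))
```

The abstain-unless-unanimous rule is one simple way to capture the paper's intent that a defended EHPO procedure should not emit conclusions an adversarial choice of hyper-HPs could overturn; the actual defense in the paper is stated in terms of its logical framework and budget t, and its acceptance criterion may differ from this sketch.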