Hyperparameter Optimization Is Deceiving Us, and How to Stop It

Authors: A. Feder Cooper, Yucheng Lu, Jessica Zosa Forde, Christopher M. De Sa

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our framework enables us to prove EHPO methods that are guaranteed to be defended against deception, given bounded compute time budget t. We demonstrate our framework's utility by proving and empirically validating a defended variant of random search." ... "Validating our defense empirically and selecting hyper-HPs. Any defense ultimately depends on the hyper-HPs it uses."
Researcher Affiliation | Academia | A. Feder Cooper, Cornell University, afc78@cornell.edu; Yucheng Lu, Cornell University, yl2967@cornell.edu; Jessica Zosa Forde, Brown University, jforde2@cs.brown.edu; Christopher De Sa, Cornell University, cdesa@cs.cornell.edu
Pseudocode | Yes | "Algorithm 1: Defense with Random Search" (a hedged sketch of the defense appears after this table)
Open Source Code | Yes | "All code can be found at https://github.com/pasta41/deception."
Open Datasets | Yes | "We first reproduce Wilson et al. [72], in which the authors trained VGG16 with different optimizers on CIFAR-10 (Figure 1a)." (a training-loop sketch of this comparison appears after this table)
Dataset Splits | No | The paper mentions splitting into train and validation sets only in general terms, stating that "the input dataset X can be split in various ways, as a function of the random seed r." It does not give the percentages, counts, or method of the validation split used in its experiments (a seeded-split sketch appears after this table).
Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments, such as GPU models, CPU types, or cloud computing instance specifications.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA x.x) that would be necessary to replicate the experiments.
Experiment Setup | Yes | "We change the hyper-HPs, shifting the distribution until Adam's performance starts to degrade, and use the resulting hyper-HPs ([10^10, 10^12]) to run our defense (Appendix). We now run a modified version of our defended EHPO in Definition 7, described in Algorithm 1, with K × R = 600 (200 logs for each optimizer). Using a budget of M = 10000 iterations, we subsample κ = 11 logs." (the defense sketch after this table illustrates the subsampling step)
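
The paper gives Algorithm 1 (defended random search) as pseudocode. The following is a minimal Python sketch of the consistency-check idea behind it, as described in the quoted setup: repeatedly subsample κ logs per optimizer from the pool of K × R logs, draw a conclusion from each subsample, and refuse to conclude (return ⊥) if subsamples disagree. The function names, the representation of a log as an (optimizer, validation-accuracy) pair, and the "best validation accuracy wins" conclusion rule are our assumptions for illustration, not the repository's API.

    import random

    def conclusion(logs):
        # Conclude in favor of the optimizer whose best log has the
        # highest validation accuracy.
        best = {}
        for opt, acc in logs:
            best[opt] = max(best.get(opt, float("-inf")), acc)
        return max(best, key=best.get)

    def defended_random_search(logs_per_opt, kappa=11, trials=1000, seed=0):
        # logs_per_opt: {"sgd": [acc, ...], "adam": [acc, ...], ...}
        # Subsample kappa logs per optimizer, draw a conclusion from each
        # subsample, and return it only if every subsample agrees;
        # otherwise return None (the "cannot conclude" output, ⊥).
        rng = random.Random(seed)
        verdicts = set()
        for _ in range(trials):
            sample = [(opt, acc)
                      for opt, logs in logs_per_opt.items()
                      for acc in rng.sample(logs, kappa)]
            verdicts.add(conclusion(sample))
            if len(verdicts) > 1:
                return None  # subsamples disagree: deception is possible
        return verdicts.pop()

With the numbers quoted in the table, each optimizer's pool would hold 200 logs and κ = 11, e.g. defended_random_search({"sgd": sgd_accs, "adam": adam_accs}, kappa=11).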
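The Wilson et al. [72] reproduction trains VGG16 with different optimizers on CIFAR-10. Below is a minimal PyTorch sketch of that comparison. The learning rates are illustrative placeholders (in the paper they are exactly what HPO searches over, not fixed constants), and the batch size and epoch count are our assumptions.

    import torch
    import torchvision
    from torchvision import transforms

    # Hypothetical fixed learning rates, for illustration only.
    OPTIMIZERS = {
        "sgd": lambda params: torch.optim.SGD(params, lr=0.05, momentum=0.9),
        "adam": lambda params: torch.optim.Adam(params, lr=1e-3),
    }

    def train_vgg16(optimizer_name, epochs=1, device="cpu"):
        # CIFAR-10 training set with a basic tensor transform.
        data = torchvision.datasets.CIFAR10(
            root="data", train=True, download=True,
            transform=transforms.ToTensor())
        loader = torch.utils.data.DataLoader(data, batch_size=128, shuffle=True)
        model = torchvision.models.vgg16(num_classes=10).to(device)
        opt = OPTIMIZERS[optimizer_name](model.parameters())
        loss_fn = torch.nn.CrossEntropyLoss()
        model.train()
        for _ in range(epochs):
            for x, y in loader:
                x, y = x.to(device), y.to(device)
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
        return model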
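On splits, the paper only states that the input dataset X can be split as a function of the random seed r, with no fractions given. A sketch of such a seed-dependent split, assuming a hypothetical 90/10 train/validation ratio:

    import torch

    def split_by_seed(dataset, val_fraction=0.1, seed=0):
        # Split into train/validation as a function of the random seed r;
        # the 10% validation fraction is our assumption, not the paper's.
        n_val = int(len(dataset) * val_fraction)
        gen = torch.Generator().manual_seed(seed)
        return torch.utils.data.random_split(
            dataset, [len(dataset) - n_val, n_val], generator=gen)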