Efficient Non-Parametric Optimizer Search for Diverse Tasks
Authors: Ruochen Wang, Yuanhao Xiong, Minhao Cheng, Cho-Jui Hsieh
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively evaluate the proposed framework on a suite of tasks, covering a variety of models and datasets. |
| Researcher Affiliation | Academia | Ruochen Wang¹, Yuanhao Xiong¹, Minhao Cheng², Cho-Jui Hsieh¹ (¹Department of Computer Science, UCLA; ²HKUST) |
| Pseudocode | Yes | Algorithm 1 in the Appendix provides a detailed summary of the complete search process. |
| Open Source Code | Yes | Our code is publicly available at https://github.com/ruocwang/enos. |
| Open Datasets | Yes | We extensively evaluate the proposed framework on a diverse set of learning tasks: digit classification with MNISTNET [10], image classification with ConvNet [15], graph learning with (Cluster-)GAT [21, 28], norm-bounded adversarial attack on robustly trained models [20, 45, 46], and BERT fine-tuning on NLP datasets [34, 56]. |
| Dataset Splits | No | The paper uses well-known datasets (e.g., MNIST, CIFAR-10) that have standard splits, but it does not explicitly state the train/validation/test splits used (e.g., percentages or sample counts). |
| Hardware Specification | Yes | Under this setting, our method finishes in 0.92h on RTX 2080ti, much faster than L2LGD2 (2.62h). |
| Software Dependencies | No | The paper mentions using Hugging Face implementations and Python, but does not specify software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | "Across all experiments, we limit the maximum level of MCT traversal to 4, and set the number of Monte Carlo samples to 32 (a multiple of 8 for parallelism on 8-GPU servers) for each level. This amounts to a fixed total budget of 128 evaluations. The maximum depth for the super-tree is set to 10." and "The batch size is set to 32." and "we set ε = 8/255, and run each optimizer once for 100 steps on every image from the test split [46]." |
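
The quoted search budget is internally consistent: 4 MCT levels × 32 Monte Carlo samples per level gives the stated fixed total of 128 evaluations. Below is a minimal sketch collecting the reported hyperparameters in one place; all names are hypothetical and are not taken from the released code.

```python
from dataclasses import dataclass


@dataclass
class SearchBudget:
    """Hyperparameters quoted above; field names are hypothetical."""
    max_mct_level: int = 4           # maximum level of MCT traversal
    mc_samples_per_level: int = 32   # a multiple of 8 for 8-GPU parallelism
    max_supertree_depth: int = 10    # maximum depth of the super-tree
    batch_size: int = 32
    attack_eps: float = 8 / 255      # norm bound for the adversarial-attack task
    attack_steps: int = 100          # optimizer steps per image on the test split

    @property
    def total_evaluations(self) -> int:
        # 4 levels x 32 Monte Carlo samples = the fixed budget of 128 evaluations
        return self.max_mct_level * self.mc_samples_per_level


if __name__ == "__main__":
    cfg = SearchBudget()
    assert cfg.total_evaluations == 128
```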