Efficient Architecture Search for Diverse Tasks
Authors: Junhong Shen, Misha Khodak, Ameet Talwalkar
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate DASH on ten tasks spanning a variety of application domains such as PDE solving, protein folding, and heart disease detection. |
| Researcher Affiliation | Academia | Junhong Shen (Carnegie Mellon University, junhongs@andrew.cmu.edu); Mikhail Khodak (Carnegie Mellon University, khodak@cmu.edu); Ameet Talwalkar (Carnegie Mellon University, talwalkar@cmu.edu) |
| Pseudocode | Yes | Algorithm 1 DASH |
| Open Source Code | Yes | Our code is made public at https://github.com/sjunhongshen/DASH. |
| Open Datasets | Yes | We evaluate the performance of DASH on diverse tasks using ten datasets from NAS-Bench-360 [4], a benchmark spanning multiple application domains, input dimensions, and learning objectives. |
| Dataset Splits | Yes | Each dataset is preprocessed and split using the NAS-Bench-360 script, with the training set being used for search, hyperparameter tuning, and retraining. Then, we evaluate the performance on a holdout validation set and select the configuration with the best validation score. |
| Hardware Specification | Yes | The entire DASH pipeline can be run on a single NVIDIA V100 GPU, which is also the system that we use to report the runtime cost. |
| Software Dependencies | No | The paper mentions software concepts like 'SGD optimizer' and 'Gumbel Softmax activation', but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, CUDA). |
| Experiment Setup | Yes | We use the default SGD optimizer for the WRN backbone and fix the learning rate schedule as well as the gradient clipping threshold for every task. To normalize architecture parameters into a probability distribution, we adopt the soft Gumbel Softmax activation, similar to Xie et al. [18]. |
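
The "soft Gumbel Softmax activation" quoted under Experiment Setup refers to normalizing the architecture parameters into a differentiable probability distribution over candidate operations. The sketch below is a minimal illustration of that normalization step only, assuming PyTorch; the module name `MixedConv1d`, the candidate kernel sizes, and the output-weighting scheme are hypothetical and do not reproduce DASH's efficient kernel aggregation from the paper or its released code.

```python
# Illustrative sketch (not the authors' implementation): architecture
# parameters normalized with a soft Gumbel-Softmax and used to weight
# candidate convolutions in a mixed operation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixedConv1d(nn.Module):
    """Weights candidate convolutions by Gumbel-Softmax-normalized
    architecture parameters (one logit per candidate kernel size)."""

    def __init__(self, channels, kernel_sizes=(3, 5, 7), tau=1.0):
        super().__init__()
        self.tau = tau
        self.ops = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2)
            for k in kernel_sizes
        )
        # One architecture parameter (logit) per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(kernel_sizes)))

    def forward(self, x):
        # Soft Gumbel-Softmax (hard=False): a differentiable probability
        # distribution over the candidate operations during search.
        weights = F.gumbel_softmax(self.alpha, tau=self.tau, hard=False, dim=-1)
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```

In this kind of relaxation, the weighted sum over candidate outputs keeps the search differentiable, so the architecture logits `alpha` can be updated by the same SGD loop that trains the backbone weights; DASH's contribution is making this step efficient for large kernel/dilation search spaces, which the sketch does not attempt to show.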