Efficient Architecture Search for Diverse Tasks

Authors: Junhong Shen, Misha Khodak, Ameet Talwalkar

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate DASH on ten tasks spanning a variety of application domains such as PDE solving, protein folding, and heart disease detection.
Researcher Affiliation | Academia | Junhong Shen (Carnegie Mellon University, junhongs@andrew.cmu.edu), Mikhail Khodak (Carnegie Mellon University, khodak@cmu.edu), Ameet Talwalkar (Carnegie Mellon University, talwalkar@cmu.edu)
Pseudocode | Yes | Algorithm 1 DASH
Open Source Code | Yes | Our code is made public at https://github.com/sjunhongshen/DASH.
Open Datasets | Yes | We evaluate the performance of DASH on diverse tasks using ten datasets from NAS-Bench-360 [4], a benchmark spanning multiple application domains, input dimensions, and learning objectives.
Dataset Splits | Yes | Each dataset is preprocessed and split using the NAS-Bench-360 script, with the training set being used for search, hyperparameter tuning, and retraining. Then, we evaluate the performance on a holdout validation set and select the configuration with the best validation score.
Hardware Specification | Yes | The entire DASH pipeline can be run on a single NVIDIA V100 GPU, which is also the system that we use to report the runtime cost.
Software Dependencies | No | The paper mentions software concepts like 'SGD optimizer' and 'Gumbel Softmax activation', but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | We use the default SGD optimizer for the WRN backbone and fix the learning rate schedule as well as the gradient clipping threshold for every task. To normalize architecture parameters into a probability distribution, we adopt the soft Gumbel Softmax activation, similar to Xie et al. [18].
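The "Dataset Splits" row above quotes a standard tune-then-select protocol: each candidate configuration is trained on the NAS-Bench-360 training split, scored on the holdout validation split, and the best-scoring configuration is kept. The sketch below illustrates that selection loop only; the helper names (select_best_config, train_and_eval) and the hyperparameter grid are hypothetical, not taken from the paper or its code.

```python
# Minimal sketch of the quoted tune-then-select protocol (assumed helper names).
from typing import Callable, Dict, List, Tuple

def select_best_config(
    configs: List[Dict],
    train_and_eval: Callable[[Dict], float],  # trains on the training split, returns validation score (higher = better)
) -> Tuple[Dict, float]:
    best_cfg, best_score = None, float("-inf")
    for cfg in configs:
        score = train_and_eval(cfg)      # search + retrain, then score on the holdout validation set
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Hypothetical hyperparameter grid (illustrative values only)
candidate_configs = [{"lr": lr, "weight_decay": wd} for lr in (0.1, 0.01) for wd in (1e-4, 5e-4)]
```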
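The "Experiment Setup" row above quotes two concrete details: architecture parameters are normalized with a soft Gumbel Softmax, and the WRN backbone is trained with SGD under a fixed learning-rate schedule and gradient-clipping threshold. The snippet below is a minimal sketch of what that could look like, assuming a PyTorch implementation and a simple mixture over candidate kernel sizes; the MixedConv1d module, the cosine schedule, and all hyperparameter values are illustrative assumptions, not the authors' code or DASH's efficient kernel aggregation.

```python
# Sketch only: soft Gumbel-Softmax over architecture logits + SGD with a fixed
# schedule and gradient clipping. Values are illustrative, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedConv1d(nn.Module):
    """Weighted mixture of candidate kernel sizes; the weights come from a soft
    Gumbel-Softmax over learnable architecture logits (illustrative only)."""
    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.ops = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2) for k in kernel_sizes
        )
        self.arch_logits = nn.Parameter(torch.zeros(len(kernel_sizes)))

    def forward(self, x, tau: float = 1.0):
        # Soft Gumbel-Softmax (hard=False) normalizes the architecture
        # parameters into a differentiable probability distribution.
        w = F.gumbel_softmax(self.arch_logits, tau=tau, hard=False)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

model = MixedConv1d(channels=8)                    # stand-in for a searched block
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)          # assumed values
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)   # assumed fixed schedule

x, y = torch.randn(4, 8, 64), torch.randn(4, 8, 64)   # dummy batch
loss = F.mse_loss(model(x), y)
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # fixed clipping threshold (assumed value)
optimizer.step()
scheduler.step()
```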