A Study on Encodings for Neural Architecture Search

Authors: Colin White, Willie Neiswanger, Sam Nolen, Yash Savani

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present our experimental results. All of our experiments follow the Best Practices for NAS checklist [9]. We discuss our adherence to these practices in the full version of this paper. In particular, we release our code at https://github.com/naszilla/naszilla. We run experiments on three search spaces which we describe below. The NASBench-101 dataset [24] consists of approximately 423,000 neural architectures pretrained on CIFAR-10.
Researcher Affiliation | Collaboration | Colin White (Abacus.AI, San Francisco, CA 94103, colin@abacus.ai); Willie Neiswanger (Stanford University and Petuum, Inc., Stanford, CA 94305, neiswanger@cs.stanford.edu); Sam Nolen (Abacus.AI, San Francisco, CA 94103, sam@abacus.ai); Yash Savani (Abacus.AI, San Francisco, CA 94103, yash@abacus.ai)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/naszilla/naszilla.
Open Datasets | Yes | We run experiments on three search spaces which we describe below. The NASBench-101 dataset [24] consists of approximately 423,000 neural architectures pretrained on CIFAR-10. The NASBench-201 dataset [1] consists of 15,625 neural architectures separately trained on each of CIFAR-10, CIFAR-100, and ImageNet16-120. The DARTS [10] search space is used for large-scale cell-based NAS experiments on CIFAR-10. (A minimal NAS-Bench-101 query sketch follows the table.)
Dataset Splits | Yes | We chose the configuration that minimizes the validation loss of the NAS algorithm after 200 queries. We also test the ability of a neural predictor to generalize to new search spaces, using a given encoding. Finally, for encodings in which multiple architectures can map to the same encoding, we evaluate the average standard deviation of accuracies for the equivalence class of architectures defined by each encoding. The neural predictor is trained on 1000 architectures and predicts the validation loss of the 5000 architectures from the test search space. (A sketch of this train/test split appears after the table.)
Hardware Specification | Yes | In each experiment, we report the test error of the neural network with the best validation error after time t, for t up to 130 TPU hours.
Software Dependencies | No | The paper does not provide specific software names with version numbers for its dependencies (e.g., PyTorch version, Python version).
Experiment Setup | Yes | Existing NAS algorithms may have hyperparameters that are optimized for a specific encoding, therefore, we perform hyperparameter tuning for each encoding. We just need to be careful that we do not perform hyperparameter tuning for specific datasets (in accordance with NAS best practices [9]). Therefore, we perform the hyperparameter search on CIFAR-100 from NAS-Bench-201, and apply the results on NAS-Bench-101. We defined a search region for each hyperparameter of each algorithm, and then for each encoding, we ran 50 iterations of random search on the full hyperparameter space. We chose the configuration that minimizes the validation loss of the NAS algorithm after 200 queries. (A sketch of this random-search loop appears after the table.)
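
To make the tabular benchmarks in the Open Datasets row concrete, below is a minimal sketch of querying NAS-Bench-101 with the public nasbench API (github.com/google-research/nasbench). The adjacency matrix and per-node operation list passed to ModelSpec are the graph-based cell representation that the paper's encodings are derived from; the specific cell shown here is an illustrative example, not one taken from the paper, and the tfrecord filename is the standard NAS-Bench-101 release.

```python
from nasbench import api

# Load the tabular benchmark (the ~423k-architecture NAS-Bench-101 tfrecord).
nasbench = api.NASBench('nasbench_only108.tfrecord')

# A 7-node cell: upper-triangular adjacency matrix plus one operation per node.
matrix = [[0, 1, 1, 0, 0, 0, 0],
          [0, 0, 0, 1, 0, 0, 0],
          [0, 0, 0, 0, 1, 0, 0],
          [0, 0, 0, 0, 1, 0, 0],
          [0, 0, 0, 0, 0, 1, 0],
          [0, 0, 0, 0, 0, 0, 1],
          [0, 0, 0, 0, 0, 0, 0]]
ops = ['input', 'conv3x3-bn-relu', 'conv1x1-bn-relu',
       'maxpool3x3', 'conv3x3-bn-relu', 'conv3x3-bn-relu', 'output']

spec = api.ModelSpec(matrix=matrix, ops=ops)
stats = nasbench.query(spec)  # look up precomputed training statistics
print(stats['validation_accuracy'], stats['test_accuracy'], stats['training_time'])
```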
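
The Dataset Splits row describes training a neural predictor on 1,000 encoded architectures and evaluating it on 5,000 held-out architectures. The sketch below shows that protocol under stated assumptions: the feature matrices here are random placeholders standing in for encoded architectures (e.g., a flattened adjacency matrix with one-hot operations) and their precomputed validation losses, and the predictor hyperparameters are illustrative rather than the paper's.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Placeholder data: in the paper, each row would be an encoded architecture and
# each target its precomputed validation loss from a tabular benchmark. The
# feature dimension (56) is chosen arbitrarily for this sketch.
X_train, y_train = rng.random((1000, 56)), rng.random(1000)
X_test, y_test = rng.random((5000, 56)), rng.random(5000)

# Illustrative predictor; the paper's predictor architecture may differ.
predictor = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
predictor.fit(X_train, y_train)
mae = np.mean(np.abs(predictor.predict(X_test) - y_test))
print(f'mean absolute error on held-out architectures: {mae:.4f}')
```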
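
The Experiment Setup row outlines per-encoding hyperparameter tuning via random search: 50 sampled configurations, each scored by the validation loss of the NAS algorithm after 200 queries. The loop below is a minimal sketch of that procedure; run_nas_algorithm, the hyperparameter names, the value ranges, and the 'adjacency_one_hot' label are assumptions made for illustration, and the exact per-algorithm search regions are not reproduced here (see the released code linked above).

```python
import random

def run_nas_algorithm(encoding, config, num_queries):
    """Hypothetical stand-in: run one NAS algorithm with `config` for
    `num_queries` queries on the tuning benchmark (CIFAR-100 from
    NAS-Bench-201) and return the best validation loss found."""
    return random.random()  # placeholder result for this sketch

# Hypothetical search region; the real regions are defined per algorithm.
search_region = {
    'population_size': [10, 20, 30, 50],
    'mutation_rate': [0.5, 1.0, 2.0],
    'num_init': [10, 20, 40],
}

best_config, best_loss = None, float('inf')
for _ in range(50):  # 50 iterations of random search, as described above
    config = {name: random.choice(choices) for name, choices in search_region.items()}
    loss = run_nas_algorithm(encoding='adjacency_one_hot', config=config, num_queries=200)
    if loss < best_loss:
        best_config, best_loss = config, loss
print('selected configuration:', best_config)
```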