Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Study on Encodings for Neural Architecture Search
Authors: Colin White, Willie Neiswanger, Sam Nolen, Yash Savani
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present our experimental results. All of our experiments follow the Best Practices for NAS checklist [9]. We discuss our adherence to these practices in the full version of this paper. In particular, we release our code at https://github.com/naszilla/naszilla. We run experiments on three search spaces which we describe below. The NASBench-101 dataset [24] consists of approximately 423,000 neural architectures pretrained on CIFAR-10. |
| Researcher Affiliation | Collaboration | Colin White Abacus.AI San Francisco, CA 94103 EMAIL Willie Neiswanger Stanford University and Petuum, Inc. Stanford, CA 94305 EMAIL Sam Nolen Abacus.AI San Francisco, CA 94103 EMAIL Yash Savani Abacus.AI San Francisco, CA 94103 EMAIL |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/naszilla/naszilla. |
| Open Datasets | Yes | We run experiments on three search spaces which we describe below. The NASBench-101 dataset [24] consists of approximately 423,000 neural architectures pretrained on CIFAR-10. The NASBench-201 dataset [1] consists of 15,625 neural architectures separately trained on each of CIFAR-10, CIFAR-100, and ImageNet16-120. The DARTS [10] search space is used for large-scale cell-based NAS experiments on CIFAR-10. |
| Dataset Splits | Yes | We chose the configuration that minimizes the validation loss of the NAS algorithm after 200 queries. We also test the ability of a neural predictor to generalize to new search spaces, using a given encoding. Finally, for encodings in which multiple architectures can map to the same encoding, we evaluate the average standard deviation of accuracies for the equivalence class of architectures defined by each encoding. The neural predictor is trained on 1000 architectures and predicts the validation loss of the 5000 architectures from the test search space. |
| Hardware Specification | Yes | In each experiment, we report the test error of the neural network with the best validation error after time t, for t up to 130 TPU hours. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for its dependencies (e.g., PyTorch version, Python version). |
| Experiment Setup | Yes | Existing NAS algorithms may have hyperparameters that are optimized for a specific encoding, therefore, we perform hyperparameter tuning for each encoding. We just need to be careful that we do not perform hyperparameter tuning for specific datasets (in accordance with NAS best practices [9]). Therefore, we perform the hyperparameter search on CIFAR-100 from NAS-Bench-201, and apply the results on NAS-Bench-101. We defined a search region for each hyperparameter of each algorithm, and then for each encoding, we ran 50 iterations of random search on the full hyperparameter space. We chose the configuration that minimizes the validation loss of the NAS algorithm after 200 queries. |
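The hyperparameter tuning procedure quoted above (50 iterations of random search per encoding, selecting the configuration that minimizes the NAS algorithm's validation loss after 200 queries) can be sketched as below. This is a minimal illustration, not the authors' implementation: the search region, the `run_nas` objective, and all hyperparameter names here are hypothetical stand-ins; the actual per-algorithm ranges and NAS loop are in the paper's code release at https://github.com/naszilla/naszilla.

```python
import random

# Hypothetical search region for one NAS algorithm's hyperparameters.
# The real per-algorithm regions are defined in the naszilla repository.
SEARCH_REGION = {
    "learning_rate": (1e-4, 1e-1),
    "mutation_rate": (0.1, 1.0),
}

def sample_config(region, rng):
    """Draw one configuration uniformly at random from the search region."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in region.items()}

def run_nas(config, n_queries=200):
    """Stand-in for running the NAS algorithm under a given encoding and
    returning its best validation loss after n_queries queries.
    Replaced here by a deterministic toy objective for illustration."""
    return (config["learning_rate"] - 0.01) ** 2 + (config["mutation_rate"] - 0.5) ** 2

def tune(region, n_iters=50, seed=0):
    """Random-search tuning as described in the paper: n_iters random
    configurations, keeping the one with the lowest validation loss
    after the NAS run's query budget."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_iters):
        config = sample_config(region, rng)
        loss = run_nas(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss
```

Per the quoted setup, tuning of this form would be run once on CIFAR-100 from NAS-Bench-201 and the selected configuration then reused on NAS-Bench-101, so that no hyperparameters are tuned on the evaluation dataset itself.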