How Powerful are Performance Predictors in Neural Architecture Search?

Authors: Colin White, Arber Zela, Binxin Ru, Yang Liu, Frank Hutter

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We study 31 predictors across four popular search spaces and four datasets: NAS-Bench-201 [13] with CIFAR-10, CIFAR-100, and ImageNet16-120, NAS-Bench-101 [71] and DARTS [32] with CIFAR-10, and NAS-Bench-NLP [21] with Penn Treebank. In order to give a fair comparison among different classes of predictors, we run a full portfolio of experiments, measuring the Pearson correlation and rank correlation metrics (Spearman, Kendall Tau, and sparse Kendall Tau), across a variety of initialization time and query time budgets.
Researcher Affiliation | Collaboration | Colin White [1], Arber Zela [2], Binxin Ru [3], Yang Liu [1], Frank Hutter [2,4]; [1] Abacus.AI, [2] University of Freiburg, [3] University of Oxford, [4] Bosch Center for Artificial Intelligence
Pseudocode | Yes | We give results in Figure 3 and pseudo-code as well as additional experiments in Section B.4.
Open Source Code | Yes | Our code, featuring a library of 31 performance predictors, is available at https://github.com/automl/naslib.
Open Datasets | Yes | NAS-Bench-101 [71] consists of over 423 000 unique neural architectures with precomputed training, validation, and test accuracies after training for 4, 12, 36, and 108 epochs on CIFAR-10 [71].
Dataset Splits | Yes | NAS-Bench-101 [71] consists of over 423 000 unique neural architectures with precomputed training, validation, and test accuracies after training for 4, 12, 36, and 108 epochs on CIFAR-10 [71]. ... NAS-Bench-201 [13] consists of 15 625 architectures ... Each architecture has full learning curve information for training, validation, and test losses/accuracies for 200 epochs on CIFAR-10 [22], CIFAR-100, and ImageNet16-120 [10].
Hardware Specification | Yes | On NAS-Bench-201 CIFAR-10, the 11 initialization time budgets are spaced logarithmically from 1 second to 1.8 × 10^7 seconds on a 1080 Ti GPU (which corresponds to training 1000 random architectures on average).
Software Dependencies | No | The paper mentions its code is based on the NASLib library, but it does not provide specific version numbers for NASLib or any other software dependencies (e.g., Python, PyTorch, etc.) in the main text.
Experiment Setup | Yes | Hyperparameter tuning. Although we used the code directly from the original repositories (sometimes making changes when necessary to adapt to NAS-Bench search spaces), the predictors had significantly different levels of hyperparameter tuning. ... For each search space, we run random search on each performance predictor for 5000 iterations, with a maximum total runtime of 15 minutes. ... The hyperparameter value ranges for each predictor can be found in Section B.2.
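
The Research Type row above lists the paper's four evaluation metrics: Pearson correlation plus three rank correlations (Spearman, Kendall Tau, and sparse Kendall Tau). A minimal sketch of computing them with SciPy follows; the sparse variant here rounds accuracies to one decimal (0.1%) before ranking, which is one common reading of "sparse Kendall Tau" and is an assumption rather than the paper's exact definition.

```python
import numpy as np
from scipy import stats

def predictor_metrics(predicted, actual, sparse_decimals=1):
    """Correlation metrics for comparing predicted vs. true accuracies.

    Rounding to `sparse_decimals` before Kendall Tau is an assumed
    definition of "sparse Kendall Tau" (it ties near-equal accuracies
    so swaps between them are not penalized).
    """
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return {
        "pearson": stats.pearsonr(predicted, actual)[0],
        "spearman": stats.spearmanr(predicted, actual)[0],
        "kendall_tau": stats.kendalltau(predicted, actual)[0],
        "sparse_kendall_tau": stats.kendalltau(
            np.round(predicted, sparse_decimals),
            np.round(actual, sparse_decimals),
        )[0],
    }

# Example: correlate a predictor's scores with true validation accuracies.
print(predictor_metrics([93.1, 91.4, 94.0, 92.2], [92.8, 91.9, 94.2, 92.0]))
```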
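
The Open Datasets and Dataset Splits rows rely on tabular benchmarks, where every architecture's training, validation, and test accuracy is precomputed, so "evaluating" an architecture is a table lookup rather than a GPU run. The sketch below illustrates the idea with a hypothetical in-memory table and made-up values; the real NAS-Bench-101/201 releases ship their own query APIs.

```python
# Hypothetical stand-in for a tabular NAS benchmark: accuracies are keyed by
# (architecture id, training epochs). All ids and values are illustrative.
PRECOMPUTED = {
    ("arch_0f3a", 12): 0.8931, ("arch_0f3a", 108): 0.9402,
    ("arch_7bc1", 12): 0.8710, ("arch_7bc1", 108): 0.9275,
}

def query_val_accuracy(arch_id: str, epochs: int = 108) -> float:
    """Look up a precomputed validation accuracy; no training is performed."""
    return PRECOMPUTED[(arch_id, epochs)]

print(query_val_accuracy("arch_7bc1", epochs=12))  # 0.871
```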
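
The Hardware Specification row mentions 11 initialization-time budgets spaced logarithmically between 1 second and 1.8 × 10^7 seconds. Assuming "logarithmically" means geometric spacing, the grid can be reproduced in one line:

```python
import numpy as np

# 11 budgets from 1 s to 1.8e7 s; each step multiplies by roughly 5.3x.
budgets = np.geomspace(1.0, 1.8e7, num=11)
print(np.round(budgets).astype(int))
```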
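
Finally, the Experiment Setup row describes the paper's tuning protocol: random search over each predictor's hyperparameters for 5000 iterations with a 15-minute wall-clock cap. A generic sketch of that protocol follows, assuming a user-supplied scoring function (e.g., cross-validated rank correlation); the ranges shown are placeholders, since the paper's actual per-predictor ranges are in its Section B.2.

```python
import random
import time

def random_search(eval_fn, ranges, n_iters=5000, max_seconds=15 * 60, seed=0):
    """Random-search tuning with an iteration cap and a wall-clock cap,
    mirroring the paper's 5000-iteration / 15-minute protocol."""
    rng = random.Random(seed)
    start = time.time()
    best_score, best_config = float("-inf"), None
    for _ in range(n_iters):
        if time.time() - start > max_seconds:
            break  # respect the wall-clock budget
        # Uniform sampling for simplicity; log-scale parameters such as
        # learning rates are often sampled log-uniformly instead.
        config = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        score = eval_fn(config)
        if score > best_score:
            best_score, best_config = score, config
    return best_config, best_score

# Placeholder ranges (hypothetical, not the paper's):
ranges = {"learning_rate": (1e-4, 1e-1), "weight_decay": (1e-6, 1e-3)}
best, score = random_search(lambda cfg: -cfg["learning_rate"], ranges, n_iters=100)
```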