Rethinking Architecture Selection in Differentiable NAS

Authors: Ruochen Wang, Minhao Cheng, Xiangning Chen, Xiaocheng Tang, Cho-Jui Hsieh

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide empirical and theoretical analysis to show that the magnitude of architecture parameters does not necessarily indicate how much the operation contributes to the supernet's performance. We re-evaluate several differentiable NAS methods with the proposed architecture selection and find that it is able to consistently extract significantly improved architectures from the underlying supernets.
Researcher Affiliation | Collaboration | 1Department of Computer Science, UCLA; 2DiDi AI Labs. {ruocwang, mhcheng}@ucla.edu, {xiangning, chohsieh}@cs.ucla.edu, xiaochengtang@didiglobal.com
Pseudocode | Yes | Algorithm 1: Perturbation-based Architecture Selection
Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of its methodology.
Open Datasets | Yes | The evaluation is based on the search space of DARTS and NAS-Bench-201 (Dong & Yang, 2020), and we show that the perturbation-based architecture selection method can be applied to several variants of DARTS. Every architecture in the search space is trained under the same protocol on three datasets (CIFAR-10, CIFAR-100, and ImageNet-16-120), and their performance can be obtained by querying the database.
Dataset Splits | No | The paper mentions 'validation accuracy' frequently but does not explicitly provide specific numerical splits (e.g., percentages or counts) for training, validation, and test sets. It implies standard splits are used for the benchmark datasets but does not detail them.
Hardware Specification | Yes | Recorded on a single GTX 1080Ti GPU.
Software Dependencies | No | The paper does not explicitly mention specific software dependencies with version numbers.
Experiment Setup | Yes | We keep all the search and retrain settings identical to DARTS since our method only modifies the architecture selection part. After the search phase, we perform perturbation-based architecture selection following Algorithm 1 on the pretrained supernet. We tune the supernet for 5 epochs between two selections as it is enough for the supernet to recover from the drop of accuracy after discretization.
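
For context, below is a minimal sketch of the perturbation-based selection loop as described in the Experiment Setup quote and Algorithm 1: for each edge of the pretrained supernet, each candidate operation is masked out in turn, the drop in supernet validation accuracy is measured, the operation whose removal hurts accuracy the most is kept for that edge, and the supernet is tuned for a few epochs before the next edge is processed. This is not the authors' released code; the supernet interface (edges, candidate_ops, validate, tune) and all names are illustrative assumptions.

# Hedged sketch of perturbation-based architecture selection (assumed interface).
from typing import Callable, Dict, List, Sequence

def select_architecture(
    edges: Sequence[str],                          # e.g. ["edge_0_1", "edge_0_2", ...]
    candidate_ops: Dict[str, List[str]],           # candidate operations per edge
    validate: Callable[[Dict[str, str]], float],   # val. accuracy with the given ops masked out
    tune: Callable[[int], None],                   # fine-tune the supernet for n epochs
    tune_epochs: int = 5,                          # the paper tunes 5 epochs between selections
) -> Dict[str, str]:
    """Greedily pick, for each edge, the op whose removal hurts supernet accuracy most."""
    selected: Dict[str, str] = {}
    for edge in edges:
        base_acc = validate({})                    # accuracy of the current (unmasked) supernet
        best_op = candidate_ops[edge][0]
        worst_drop = float("-inf")
        for op in candidate_ops[edge]:
            acc_without_op = validate({edge: op})  # mask only `op` on `edge`
            drop = base_acc - acc_without_op
            if drop > worst_drop:                  # largest drop => most important op
                best_op, worst_drop = op, drop
        selected[edge] = best_op
        # Discretize this edge to `best_op`, then let the supernet recover before the next edge.
        tune(tune_epochs)
    return selected

if __name__ == "__main__":
    # Toy demo with a fake supernet in which "sep_conv_3x3" matters most on every edge.
    def fake_validate(masked: Dict[str, str]) -> float:
        return 80.0 if "sep_conv_3x3" in masked.values() else 90.0

    def fake_tune(epochs: int) -> None:
        pass  # a real implementation would run `epochs` epochs of supernet training here

    ops = ["sep_conv_3x3", "skip_connect", "max_pool_3x3"]
    arch = select_architecture(
        edges=["edge_0_1", "edge_0_2"],
        candidate_ops={"edge_0_1": ops, "edge_0_2": ops},
        validate=fake_validate,
        tune=fake_tune,
    )
    print(arch)  # {'edge_0_1': 'sep_conv_3x3', 'edge_0_2': 'sep_conv_3x3'}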