Zero-Cost Proxies for Lightweight NAS
Authors: Mohamed S Abdelfattah, Abhinav Mehrotra, Łukasz Dudziak, Nicholas Donald Lane
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we evaluate conventional reduced-training proxies and quantify how well they preserve ranking between neural network models during search when compared with the rankings produced by final trained accuracy. We propose a series of zero-cost proxies... Our zero-cost proxies use 3 orders of magnitude less computation but can match and even outperform conventional proxies. (Section 4: Empirical Evaluation of Proxy Tasks.) A rank-correlation sketch illustrating this evaluation follows the table. |
| Researcher Affiliation | Collaboration | Mohamed S. Abdelfattah¹, Abhinav Mehrotra¹, Łukasz Dudziak¹, Nicholas D. Lane¹,² (¹Samsung AI Center, Cambridge; ²University of Cambridge). Contact: mohamed1.a@samsung.com |
| Pseudocode | No | The paper describes methods using prose and mathematical formulas, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is made public at: https://github.com/mohsaied/zero-cost-nas. |
| Open Datasets | Yes | NAS-Bench-201 on CIFAR-10 and CIFAR-100; CIFAR-10, CIFAR-100 (Krizhevsky, 2009) and SVHN (Netzer et al., 2011); and ~200 models for ImageNet1k (Deng et al., 2009). NAS-Bench-ASR... evaluated on the TIMIT dataset (Garofolo et al., 1993). NAS-Bench-101... with over 423k CNN models and training statistics on CIFAR-10 (Ying et al., 2019). |
| Dataset Splits | Yes | Figure 2: Correlation of validation accuracy to final test accuracy during the first 12 epochs of training for three datasets on the NAS-Bench-201 search space. The full configuration training of NAS-Bench-201 on CIFAR-10 uses input resolution r=32, number of channels in the stem convolution c=16 and number of epochs e=200 |
| Hardware Specification | Yes | We used an Nvidia GeForce GTX 1080 Ti and ran a random sample of 10 models for 10 epochs to get an average time-per-epoch for each proxy at different batch sizes. A timing sketch illustrating this measurement follows the table. |
| Software Dependencies | No | The paper mentions software like 'PyTorchCV' and the 'REINFORCE algorithm' but does not provide specific version numbers for any software dependencies (e.g., PyTorch version, Python version, CUDA version). |
| Experiment Setup | Yes | In Table 6 we list the hyper-parameters used in training the EcoNAS proxies to produce Figure 1. The only difference to the standard NAS-Bench-201 training pipeline (Dong & Yang, 2020) is our use of fewer epochs for the learning rate annealing schedule: we anneal the learning rate to zero over 40 epochs instead of 200. This is a common technique used to speed up convergence when training proxies (Zhou et al., 2020). Table 6: EcoNAS training hyper-parameters for NAS-Bench-201: optimizer SGD (Nesterov, momentum 0.9); initial LR 0.1; final LR 0; LR schedule cosine; weight decay 0.0005; epochs 40; batch size 256; augmentation random flip (p=0.5) and random crop. For all NAS experiments, we repeat experiments 32 times and plot the median and shade between the lower/upper quartiles. A training-configuration sketch based on these hyper-parameters follows the table. |
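The ranking evaluation quoted under "Research Type" amounts to comparing the ordering a zero-cost proxy assigns to candidate architectures against the ordering given by their final trained accuracy. Below is a minimal sketch of such a rank-correlation check, assuming per-model proxy scores and final accuracies are already available (e.g. queried from NAS-Bench-201); the names `proxy_scores`/`final_accuracies` and the toy numbers are illustrative, not values or identifiers from the paper or its repository.

```python
# Minimal sketch: rank correlation between zero-cost proxy scores and final accuracy.
# Assumes scores and accuracies have already been collected for a set of architectures.
from scipy import stats

def rank_correlation(proxy_scores, final_accuracies):
    """Spearman rho and Kendall tau between a proxy's ranking of models
    and the ranking induced by final trained accuracy."""
    rho, _ = stats.spearmanr(proxy_scores, final_accuracies)
    tau, _ = stats.kendalltau(proxy_scores, final_accuracies)
    return rho, tau

# Toy example (numbers are illustrative, not results from the paper):
scores = [0.12, 0.98, 0.45, 0.67]   # zero-cost proxy score per architecture
accs   = [61.3, 93.2, 71.8, 88.5]   # final test accuracy per architecture
rho, tau = rank_correlation(scores, accs)
print(f"Spearman rho = {rho:.3f}, Kendall tau = {tau:.3f}")
```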
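The hardware note above describes averaging wall-clock training time per epoch over a small sample of models at different batch sizes on a single GPU. The sketch below shows one way such a measurement could be taken, assuming a CUDA GPU is available; `model` and `train_set` are placeholders, and this is not the authors' benchmarking code.

```python
# Minimal sketch: average wall-clock time per training epoch at a given batch size.
import time
import torch
from torch.utils.data import DataLoader

def avg_epoch_time(model, train_set, batch_size, epochs=10, device="cuda"):
    model = model.to(device)
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    criterion = torch.nn.CrossEntropyLoss()
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(epochs):
        for images, targets in loader:
            images, targets = images.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()
            optimizer.step()
    torch.cuda.synchronize()
    return (time.time() - start) / epochs  # average seconds per epoch
```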
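The Table 6 hyper-parameters quoted under "Experiment Setup" map directly onto a standard PyTorch training loop: SGD with Nesterov momentum 0.9 and weight decay 5e-4, initial LR 0.1 annealed to 0 with a cosine schedule over 40 epochs, batch size 256, and random crop plus flip augmentation. The sketch below assembles that recipe; `model` and `train_set` are assumed placeholders, and this is an illustration of the quoted configuration rather than the authors' training code.

```python
# Minimal sketch of the EcoNAS proxy training recipe from Table 6.
import torch
from torch.utils.data import DataLoader
from torchvision import transforms

# Augmentation from Table 6: random crop and random flip (apply when building
# the CIFAR-10 train_set, e.g. torchvision.datasets.CIFAR10(transform=...)).
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])

def train_econas_proxy(model, train_set, epochs=40):
    loader = DataLoader(train_set, batch_size=256, shuffle=True)
    optimizer = torch.optim.SGD(
        model.parameters(), lr=0.1, momentum=0.9,
        weight_decay=5e-4, nesterov=True,
    )
    # Cosine annealing of the LR from 0.1 down to 0 over 40 epochs
    # (instead of the 200 epochs used in the full NAS-Bench-201 pipeline).
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=epochs, eta_min=0.0,
    )
    criterion = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, targets in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()
            optimizer.step()
        scheduler.step()
    return model
```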