Neural Architecture Search in A Proxy Validation Loss Landscape
Authors: Yanxi Li, Minjing Dong, Yunhe Wang, Chang Xu
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on benchmarks demonstrate that the architecture searched by the proposed algorithm can achieve a satisfactory accuracy with less time cost. |
| Researcher Affiliation | Collaboration | 1School of Computer Science, University of Sydney 2Noah's Ark Lab, Huawei Technologies. |
| Pseudocode | Yes | Algorithm 1 Loss Space Regression (a hedged code sketch of loss-space regression follows the table) |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the methodology described. |
| Open Datasets | Yes | Following previous works (Liu et al., 2018b; Dong & Yang, 2019), we use the CIFAR-10 dataset (Krizhevsky et al., 2009) for architecture searching and results evaluation. The CIFAR-10 dataset contains 50,000 training images together with 10,000 testing images from 10 classes. The generality of the architecture we obtained is tested on ImageNet 2012 (Russakovsky et al., 2015). |
| Dataset Splits | Yes | During the searching phase, we shuffle the training set and divide it into two parts with equal size for model weights training and validation performance inference respectively. (A minimal data-split sketch follows the table.) |
| Hardware Specification | No | The paper mentions "GPU days" but does not specify any particular GPU models, CPU models, or other hardware specifications used for the experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for reproducibility. |
| Experiment Setup | Yes | We set the number of candidates K = 8, including 4 convolutional operations... The super-network for searching is constructed by stacking 8 cells... The network has 16 initial channels... The warm-up population is initialized with 100 randomly sampled architectures... We trained models in the warm-up population with minibatch gradient descent, whose batch size is set to 64 and the base learning rate is set to 0.025... The architecture weights and validation loss estimator are both optimized by Adam with a constant learning rate of 0.1. The Softmax temperature τ in Gumbel-Softmax is set to 0.1. To evaluate the performance of the obtained architecture, a larger network is constructed with 20 stacked cells and 36 initial channels. The network is trained with the same training setting as in the searching phase for 600 epochs on the complete CIFAR-10 training set. (These settings are collected in the configuration sketch after the table.) |
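
The "Pseudocode" row points to Algorithm 1 (Loss Space Regression). Below is a minimal PyTorch sketch of the general idea, not the authors' implementation: an MLP estimator is fit on a warm-up population of (architecture, validation loss) pairs, and the architecture parameters are then refined by descending this proxy landscape through a Gumbel-Softmax relaxation (τ = 0.1, Adam with learning rate 0.1, K = 8 candidate operations, as quoted above). The number of edges, the estimator's width and depth, and the helper names are assumptions.

```python
# Minimal sketch of loss-space regression (not the authors' code): an MLP
# estimator maps relaxed architecture parameters to a predicted validation
# loss, is fit on a warm-up population of (architecture, loss) pairs, and
# the architecture parameters are then refined by descending the proxy.
import torch
import torch.nn as nn

NUM_EDGES, NUM_OPS = 14, 8          # NUM_EDGES is an assumption; K = 8 candidate operations

class LossEstimator(nn.Module):
    """Regress validation loss from flattened architecture weights."""
    def __init__(self, num_edges=NUM_EDGES, num_ops=NUM_OPS, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_edges * num_ops, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, arch):            # arch: (batch, num_edges, num_ops)
        return self.net(arch.flatten(1)).squeeze(-1)

def fit_estimator(estimator, warmup_archs, warmup_losses, epochs=100):
    """Fit the proxy on the warm-up (architecture, validation-loss) pairs."""
    opt = torch.optim.Adam(estimator.parameters(), lr=0.1)
    for _ in range(epochs):
        opt.zero_grad()
        pred = estimator(warmup_archs)
        nn.functional.mse_loss(pred, warmup_losses).backward()
        opt.step()

def search_architecture(estimator, steps=200, tau=0.1):
    """Descend the proxy landscape over architecture logits (Gumbel-Softmax relaxation)."""
    alpha = torch.zeros(NUM_EDGES, NUM_OPS, requires_grad=True)
    opt = torch.optim.Adam([alpha], lr=0.1)
    for _ in range(steps):
        opt.zero_grad()
        arch = nn.functional.gumbel_softmax(alpha, tau=tau, dim=-1)
        estimator(arch.unsqueeze(0)).mean().backward()
        opt.step()
    return alpha.argmax(dim=-1)         # discrete operation choice per edge
```

The separation of `fit_estimator` and `search_architecture` here is only for readability; per the "Experiment Setup" row, the warm-up losses come from training the 100 sampled architectures with minibatch gradient descent before the proxy is used.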
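
The "Dataset Splits" row describes shuffling the CIFAR-10 training set and dividing it into two equal halves, one for training the super-network weights and one for validation-loss inference. A minimal sketch assuming torchvision's CIFAR-10; the transform, seed, and loader names are illustrative, not taken from the paper.

```python
# Minimal sketch of the searching-phase data split described above, assuming
# torchvision's CIFAR-10; transform, seed, and loader names are illustrative.
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler
from torchvision import datasets, transforms

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())

# Shuffle the 50,000 training images and split them into two equal halves:
# one for training the super-network weights, one for validation-loss inference.
g = torch.Generator().manual_seed(0)            # seed value is an assumption
perm = torch.randperm(len(train_set), generator=g).tolist()
half = len(train_set) // 2

weights_loader = DataLoader(train_set, batch_size=64,
                            sampler=SubsetRandomSampler(perm[:half]))
val_loader = DataLoader(train_set, batch_size=64,
                        sampler=SubsetRandomSampler(perm[half:]))
```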
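
Finally, the hyperparameters quoted in the "Experiment Setup" row collected into a single configuration sketch; the dictionary keys, the SGD momentum value, and the function name are assumptions made for illustration, not settings reported by the paper.

```python
# Hyperparameters quoted in the "Experiment Setup" row, gathered in one place.
# Config keys, the SGD momentum value, and make_optimizers are assumptions.
import torch

SEARCH_CFG = dict(
    num_candidate_ops=8,      # K = 8
    search_cells=8,           # super-network depth
    init_channels=16,
    warmup_population=100,    # randomly sampled architectures
    batch_size=64,
    weight_lr=0.025,          # base learning rate for model weights
    arch_lr=0.1,              # Adam lr for architecture weights / loss estimator
    gumbel_tau=0.1,
)

EVAL_CFG = dict(
    eval_cells=20,
    init_channels=36,
    epochs=600,               # trained on the complete CIFAR-10 training set
)

def make_optimizers(model_params, arch_params, estimator_params):
    """Optimizers as described: minibatch SGD for weights, Adam for the rest."""
    w_opt = torch.optim.SGD(model_params, lr=SEARCH_CFG["weight_lr"],
                            momentum=0.9)           # momentum is an assumption
    a_opt = torch.optim.Adam(list(arch_params) + list(estimator_params),
                             lr=SEARCH_CFG["arch_lr"])
    return w_opt, a_opt
```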