SNAS: Stochastic Neural Architecture Search
Authors: Sirui Xie, Hehui Zheng, Chunxiao Liu, Liang Lin
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, SNAS shows strong performance compared with DARTS and all other existing NAS methods in terms of test error, model complexity and searching resources. Specifically, SNAS discovers novel convolutional cells achieving 2.85 ± 0.02% test error on CIFAR-10 with only 2.8M parameters, which is better than 3.00 ± 0.14% with 3.3M parameters from 1st-order DARTS and 2.89% with 4.6M parameters from ENAS. |
| Researcher Affiliation | Industry | Sirui Xie, Hehui Zheng, Chunxiao Liu, Liang Lin (SenseTime) {xiesirui, zhenghehui, liuchunxiao}@sensetime.com, linliang@ieee.org |
| Pseudocode | No | The paper describes methods through mathematical equations and textual descriptions, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions publicly released code by other researchers (Liu et al., 2019 and Pham et al., 2018) but does not provide a statement or link for the open-source code of SNAS itself. |
| Open Datasets | Yes | The CIFAR-10 dataset (Krizhevsky & Hinton, 2009) is a basic dataset for image classification, which consists of 50,000 training images and 10,000 testing images. The discovered cell achieves 27.3% top-1 error when transferred to ImageNet (mobile setting)... |
| Dataset Splits | Yes | The CIFAR-10 dataset (Krizhevsky & Hinton, 2009) consists of 50,000 training images and 10,000 testing images. Data transformation is achieved by the standard data pre-processing and augmentation techniques (see Appendix G.1), normalizing the training and validation images by subtracting the channel mean and dividing by the channel standard deviation (see the preprocessing sketch after this table). |
| Hardware Specification | Yes | All the experiments were performed using NVIDIA TITAN Xp GPUs |
| Software Dependencies | No | The paper mentions PyTorch as the implementation framework but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The neural operation parameters θ are optimized using momentum SGD, with initial learning rate η_θ = 0.025 (annealed down to zero following a cosine schedule), momentum 0.9, and weight decay 3 × 10⁻⁴. The architecture distribution parameters α are optimized by Adam, with initial learning rate η_α = 3 × 10⁻⁴, momentum β = (0.5, 0.999) and weight decay 10⁻³. The batch size employed is 64 and the initial number of channels is 16 (see the optimizer sketch after this table). |
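
The pre-processing described in the Dataset Splits row can be summarized as a short torchvision sketch. The paper only confirms "standard" augmentation (Appendix G.1) and normalization by channel mean and standard deviation; the crop and flip settings and the exact mean/std constants below are common CIFAR-10 defaults, assumed here rather than quoted from the paper.

```python
import torchvision.transforms as transforms

# Commonly used CIFAR-10 per-channel statistics (assumed, not stated in the paper).
CIFAR_MEAN = [0.4914, 0.4822, 0.4465]
CIFAR_STD = [0.2470, 0.2435, 0.2616]

train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),         # standard augmentation (assumed)
    transforms.RandomHorizontalFlip(),            # standard augmentation (assumed)
    transforms.ToTensor(),
    transforms.Normalize(CIFAR_MEAN, CIFAR_STD),  # subtract channel mean, divide by channel std
])

valid_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(CIFAR_MEAN, CIFAR_STD),  # validation images normalized the same way
])
```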
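The hyperparameters in the Experiment Setup row map onto the following minimal PyTorch sketch. The helper name `build_search_optimizers` and the accessors `weight_parameters()` / `arch_parameters()` are hypothetical placeholders for separating the two parameter groups; only the learning rates, momenta, weight decays, and the cosine schedule are taken from the row above.

```python
import torch

def build_search_optimizers(model, num_epochs):
    """Hypothetical helper reproducing the reported optimizer settings."""
    # Momentum SGD for the neural operation parameters theta,
    # with the learning rate annealed to zero by a cosine schedule.
    optimizer_theta = torch.optim.SGD(
        model.weight_parameters(),    # assumed accessor for operation weights
        lr=0.025, momentum=0.9, weight_decay=3e-4,
    )
    scheduler_theta = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer_theta, T_max=num_epochs, eta_min=0.0,
    )
    # Adam for the architecture distribution parameters alpha.
    optimizer_alpha = torch.optim.Adam(
        model.arch_parameters(),      # assumed accessor for architecture parameters
        lr=3e-4, betas=(0.5, 0.999), weight_decay=1e-3,
    )
    return optimizer_theta, scheduler_theta, optimizer_alpha
```

The batch size of 64 and the initial channel count of 16 belong to the data loader and network construction rather than the optimizers, so they are not shown here.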