Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search
Authors: Youhei Akimoto, Shinichi Shirakawa, Nozomu Yoshinari, Kento Uchida, Shota Saito, Kouhei Nishida
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Despite its simplicity and no problem-dependent parameter tuning, our method exhibited near state-of-the-art performances with low computational budgets both on image classification and inpainting tasks. |
| Researcher Affiliation | Collaboration | (1) University of Tsukuba & RIKEN AIP, (2) Yokohama National University, (3) Skill Up AI Co., Ltd., (4) Shinshu University. |
| Pseudocode | Yes | Algorithm 1 ASNG-NAS |
| Open Source Code | Yes | The code is available at https://github.com/shirakawas/ASNG-NAS. |
| Open Datasets | Yes | We use the CIFAR-10 dataset and adopt the standard preprocessing and data augmentation as done in the previous works, e.g., Liu et al. (2019); Pham et al. (2018). We use the Celeb Faces Attributes Dataset (Celeb A) (Liu et al., 2015). |
| Dataset Splits | Yes | During the architecture search, we split the training dataset into halves as D = {D_x, D_θ} as done in Liu et al. (2019). (A minimal split sketch is given below the table.) |
| Hardware Specification | Yes | The experiments were done with a single NVIDIA GTX 1080Ti GPU |
| Software Dependencies | Yes | ASNG-NAS is implemented using PyTorch 0.4.1 (Paszke et al., 2017). |
| Experiment Setup | Yes | In the architecture search phase, we optimize x and θ for 100 epochs (about 40K iterations) with a mini-batch size of 64. We use SGD with a momentum of 0.9 to optimize weights x. The step-size ε_x changes from 0.025 to 0 following the cosine schedule (Loshchilov & Hutter, 2017). After the architecture search phase, we retrain the network with the most likely architecture, ĉ = argmax_c p_θ(c), from scratch, which is a commonly used technique (Brock et al., 2018; Liu et al., 2019; Pham et al., 2018) to improve final performance. In the retraining stage, we can exclude the redundant (unused) weights. Then, we optimize x for 600 epochs with a mini-batch size of 80. |
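The half-and-half split quoted in the "Dataset Splits" row can be reproduced in a few lines. The sketch below is illustrative only: it targets a current PyTorch release rather than the 0.4.1 version used in the paper, and the names `D_x` / `D_theta` are placeholders, not the authors' code.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Standard CIFAR-10 preprocessing as in prior NAS work
# (random crop, horizontal flip, tensor conversion).
transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transform)

# Split the training set into two equal halves:
# D_x updates the weights x, D_theta updates the distribution
# parameters theta (names chosen here for illustration).
n_half = len(train_set) // 2
D_x, D_theta = random_split(train_set,
                            [n_half, len(train_set) - n_half],
                            generator=torch.Generator().manual_seed(0))
```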
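As a reading aid for the "Experiment Setup" quote, here is a minimal PyTorch sketch of its two ingredients: the SGD-with-momentum weight optimizer under a cosine step-size schedule, and selecting the most likely architecture ĉ = argmax_c p_θ(c) from a factorized categorical distribution before retraining. `model`, `theta`, and the helper names are assumptions for illustration, not the authors' interface.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

def make_weight_optimizer(model, epochs=100):
    # SGD with momentum 0.9; the step size eps_x follows a cosine
    # schedule from 0.025 down to 0 over the search epochs.
    opt = SGD(model.parameters(), lr=0.025, momentum=0.9)
    sched = CosineAnnealingLR(opt, T_max=epochs, eta_min=0.0)
    return opt, sched

def most_likely_architecture(theta):
    # theta: a list of 1-D probability vectors, one per categorical
    # architecture decision. Taking the argmax of each factor gives
    # the most likely architecture, which is then retrained from
    # scratch with its unused weights discarded.
    return [int(torch.argmax(p)) for p in theta]
```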