AtomNAS: Fine-Grained End-to-End Neural Architecture Search

Authors: Jieru Mei, Yingwei Li, Xiaochen Lian, Xiaojie Jin, Linjie Yang, Alan Yuille, Jianchao Yang

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiment, our method achieves 75.9% top-1 accuracy on the ImageNet dataset at around 360M FLOPs, which is 0.9% higher than the state-of-the-art model (Stamoulis et al., 2019a). We first describe the implementation details in Section 4.1 and then compare AtomNAS with previous state-of-the-art methods under various FLOPs constraints in Section 4.2. In Section 4.3, we provide a more detailed analysis of AtomNAS. Finally, in Section 4.4, we demonstrate the transferability of AtomNAS networks by evaluating them on detection and instance segmentation tasks.
Researcher Affiliation | Collaboration | Jieru Mei¹, Yingwei Li¹, Xiaochen Lian², Xiaojie Jin², Linjie Yang², Alan Yuille¹ & Jianchao Yang²; ¹Johns Hopkins University, ²ByteDance AI Lab
Pseudocode | Yes | Algorithm 1: Dynamic network shrinkage
Open Source Code | Yes | We open our entire codebase at: https://github.com/meijieru/AtomNAS.
Open Datasets | Yes | We apply AtomNAS to search high-performance lightweight models on the ImageNet 2012 classification task (Deng et al., 2009) and on the COCO dataset (Lin et al., 2014).
Dataset Splits | Yes | All the models are trained on COCO train2017 with batch size 16 and evaluated on COCO val2017.
Hardware Specification | Yes | When training the supernet, we use a total batch size of 2048 on 32 Tesla V100 GPUs and train for 350 epochs.
Software Dependencies | No | The paper mentions using the "RMSProp optimizer" and "MMDetection" but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | Yes | We use the same training configuration (e.g., RMSProp optimizer, EMA on weights and exponential learning rate decay) as Tan et al. (2019); Stamoulis et al. (2019a)... When training the supernet, we use a total batch size of 2048 on 32 Tesla V100 GPUs and train for 350 epochs. For our dynamic network shrinkage algorithm, we set the momentum factor β in Eq. (7) to 0.9999... By setting the weight of the L1 penalty term λ to be 1.8 × 10⁻⁴, 1.2 × 10⁻⁴ and 1.0 × 10⁻⁴ respectively
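
The Experiment Setup row quotes the key training ingredients: an RMSProp optimizer, an EMA on the weights with momentum β = 0.9999, and an L1 penalty of weight λ on the scale factors used by the dynamic network shrinkage algorithm. Below is a minimal PyTorch-style sketch of how these pieces could fit together; the one-layer AtomSupernet, the learning rate, the pruning threshold, and the function names are illustrative assumptions, not taken from the paper or its released code.

import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


class AtomSupernet(nn.Module):
    """Toy stand-in for the search space: the output channels of one convolution
    play the role of atomic blocks, each gated by its BatchNorm scale factor."""

    def __init__(self, num_atoms=64, num_classes=1000):
        super().__init__()
        self.conv = nn.Conv2d(3, num_atoms, 3, padding=1)
        self.bn = nn.BatchNorm2d(num_atoms)        # bn.weight[i] is the gamma of atom i
        self.head = nn.Linear(num_atoms, num_classes)

    def forward(self, x):
        x = F.relu(self.bn(self.conv(x)))
        return self.head(x.mean(dim=(2, 3)))       # global average pooling


def l1_penalty(model, lam):
    """L1 regularization on all BN scale factors, weighted by lambda
    (1.8e-4, 1.2e-4 or 1.0e-4 in the quoted setup, depending on the FLOPs target)."""
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules() if isinstance(m, nn.BatchNorm2d))


def train_with_shrinkage(model, loader, lam=1.2e-4, beta=0.9999, threshold=1e-3):
    # Learning rate and other RMSProp hyper-parameters are placeholders, not from the paper.
    opt = torch.optim.RMSprop(model.parameters(), lr=0.05, momentum=0.9)
    ema_importance = None                                    # EMA of |gamma| per atom
    ema_weights = copy.deepcopy(model.state_dict())          # EMA of weights (for evaluation)

    for images, labels in loader:
        opt.zero_grad()
        loss = F.cross_entropy(model(images), labels) + l1_penalty(model, lam)
        loss.backward()
        opt.step()

        # Track atom importance with an exponential moving average (momentum beta = 0.9999).
        gammas = model.bn.weight.detach().abs()
        ema_importance = gammas.clone() if ema_importance is None else \
            beta * ema_importance + (1 - beta) * gammas

        # EMA of the model weights, as in the quoted training configuration.
        with torch.no_grad():
            for k, v in model.state_dict().items():
                ema_weights[k] = beta * ema_weights[k].float() + (1 - beta) * v.float()

        # Dynamic shrinkage: atoms whose EMA importance falls below the threshold are
        # considered dead; a full implementation would remove the corresponding channels
        # from the supernet here to save computation (omitted in this sketch).
        dead_atoms = (ema_importance < threshold).nonzero().flatten()

    return ema_weights, ema_importance, dead_atoms

In the paper, the atoms are channel slices of a MobileNetV2-style search space with mixed kernel sizes and shrinkage runs over the full 350-epoch supernet training; the sketch above compresses that to a single layer and a per-step check only to illustrate the bookkeeping around λ and β.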