ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Authors: Han Cai, Ligeng Zhu, Song Han
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on CIFAR-10 and ImageNet demonstrate the effectiveness of directness and specialization. |
| Researcher Affiliation | Academia | Han Cai, Ligeng Zhu, Song Han Massachusetts Institute of Technology {hancai, ligeng, songhan}@mit.edu |
| Pseudocode | No | The paper includes diagrams (e.g., Figure 2) and describes algorithms in text, but it does not present any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Pretrained models and evaluation code are released at https://github.com/MIT-HAN-LAB/ProxylessNAS. |
| Open Datasets | Yes | We demonstrate the effectiveness of our proposed method on two benchmark datasets (CIFAR-10 and ImageNet) for the image classification task. |
| Dataset Splits | Yes | We randomly sample 5,000 images from the training set as a validation set for learning architecture parameters which are updated using the Adam optimizer with an initial learning rate of 0.006 for the gradient-based algorithm (Section 3.2.1) and 0.01 for the REINFORCE-based algorithm (Section 3.3.2). |
| Hardware Specification | Yes | The GPU latency is measured on V100 GPU with a batch size of 8... The CPU latency is measured under batch size 1 on a server with two 2.40GHz Intel(R) Xeon(R) CPU E5-2640 v4. The mobile latency is measured on Google Pixel 1 phone with a batch size of 1. (A hedged timing sketch follows the table.) |
| Software Dependencies | No | The paper mentions "TensorFlow Lite" in Appendix B but does not provide specific version numbers for it or any other software dependencies. Therefore, a reproducible description is not provided. |
| Experiment Setup | Yes | We randomly sample 5,000 images from the training set as a validation set for learning architecture parameters which are updated using the Adam optimizer with an initial learning rate of 0.006 for the gradient-based algorithm (Section 3.2.1) and 0.01 for the REINFORCE-based algorithm (Section 3.3.2). ... After the training process of the over-parameterized network completes, a compact network is derived... Next, we train the compact network using the same training settings except that the number of training epochs increases from 200 to 300. (A hedged setup sketch follows the table.) |
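
The split and optimizer settings quoted in the "Dataset Splits" and "Experiment Setup" rows can be summarized in a short sketch. This is a minimal illustration assuming a PyTorch-style workflow with CIFAR-10 from torchvision; names such as `arch_params` and the `SubsetRandomSampler`-based split are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of the reported search setup (assumptions: PyTorch-style API,
# CIFAR-10 via torchvision; `arch_params` is an illustrative placeholder).
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler
from torchvision import datasets, transforms

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())

# Randomly hold out 5,000 training images as the validation set used to
# update the architecture parameters (as reported in the paper).
perm = torch.randperm(len(train_set))
val_idx, train_idx = perm[:5000], perm[5000:]
train_loader = DataLoader(train_set, batch_size=256,
                          sampler=SubsetRandomSampler(train_idx))
val_loader = DataLoader(train_set, batch_size=256,
                        sampler=SubsetRandomSampler(val_idx))

# Architecture parameters are updated with Adam; the reported initial learning
# rate is 0.006 for the gradient-based variant and 0.01 for the REINFORCE-based one.
arch_params = [torch.zeros(6, requires_grad=True)]  # placeholder architecture logits
arch_optimizer = torch.optim.Adam(arch_params, lr=0.006)
```

After the search, the paper states that the derived compact network is retrained with the same settings except for 300 rather than 200 epochs; that retraining loop is omitted here.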
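
The "Hardware Specification" row reports latencies at fixed batch sizes (V100 GPU with batch 8, Xeon CPU and Pixel 1 phone with batch 1). Below is a hedged sketch of how such a GPU latency number might be timed; the warm-up and repeat counts and the `model` placeholder are assumptions, not the paper's measurement harness.

```python
# Illustrative GPU latency timing at batch size 8 (assumption: a PyTorch model
# on a CUDA device; warm-up/repeat counts are arbitrary choices, not the paper's).
import torch

def measure_gpu_latency_ms(model, input_shape=(8, 3, 224, 224),
                           warmup=20, repeats=100):
    model = model.cuda().eval()
    x = torch.randn(*input_shape, device="cuda")
    with torch.no_grad():
        for _ in range(warmup):          # warm up kernels / cuDNN autotuning
            model(x)
        torch.cuda.synchronize()
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        for _ in range(repeats):
            model(x)
        end.record()
        torch.cuda.synchronize()         # wait for all queued kernels to finish
    return start.elapsed_time(end) / repeats  # average milliseconds per batch
```

CPU and mobile latencies would be measured analogously at batch size 1 on the respective devices; the paper notes TensorFlow Lite for the mobile measurement but gives no version.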