InstaNAS: Instance-Aware Neural Architecture Search
Authors: An-Chieh Cheng, Chieh Hubert Lin, Da-Cheng Juan, Wei Wei, Min Sun
AAAI 2020, pp. 3577-3584
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments within a search space inspired by MobileNetV2 show InstaNAS can achieve up to 48.8% latency reduction without compromising accuracy on a series of datasets against MobileNetV2. |
| Researcher Affiliation | Collaboration | National Tsing-Hua University, Hsinchu, Taiwan; Google Research, Mountain View, USA; Appier Inc., Taiwan; MOST Joint Research Center for AI Technology and All Vista Healthcare, Taiwan |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code or explicitly state that the code is open-sourced. |
| Open Datasets | Yes | We validate InstaNAS on CIFAR-10/100 with the search space described in the previous section. ... Experiments on Tiny ImageNet and ImageNet. |
| Dataset Splits | No | The paper mentions using CIFAR-10/100, Tiny ImageNet, and ImageNet, and discusses training stages, but does not provide specific percentages or counts for validation splits, nor reference predefined splits. |
| Hardware Specification | No | For a fair comparison, all CPU latencies are measured in the same workstation and the same framework (PyTorch v1.0.0). No specific details about the CPU model, GPU, or other hardware specifications are provided. |
| Software Dependencies | Yes | For a fair comparison, all CPU latencies are measured in the same workstation and the same framework (PyTorch v1.0.0). A hedged sketch of such a CPU-latency measurement follows the table. |
| Experiment Setup | Yes | For pre-training the meta-graph, we use a Stochastic Gradient Descent optimizer with initial learning rate 0.1. After the joint training ends, some controllers are picked by human preference by considering the accuracy and latency trade-off. At this point, the accuracy measured in the joint training stage can only be considered a reference value; the meta-graph needs to be re-trained from scratch with respect to the picked policy. We use an Adam optimizer with learning rate 0.01 that decays with cosine annealing. ... we apply random cropping, random horizontal flipping, and Cutout (DeVries and Taylor 2017) as data augmentation methods. A hedged sketch of this training configuration follows the table. |
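
The latency-measurement protocol is only described at a high level (CPU latency on a single workstation under PyTorch v1.0.0). Below is a minimal sketch of how such a per-instance CPU latency measurement is typically taken; the MobileNetV2 stand-in model, the CIFAR-sized input, and the warm-up/repetition counts are illustrative assumptions, not details from the paper.

```python
import time

import torch
import torchvision.models as models

# Stand-in network and input shape (assumptions): the paper's search space is
# MobileNetV2-inspired, so a torchvision MobileNetV2 is used here purely for
# illustration, with a CIFAR-sized 32x32 input.
model = models.mobilenet_v2(num_classes=10).eval()
dummy_input = torch.randn(1, 3, 32, 32)

with torch.no_grad():
    for _ in range(10):          # warm-up iterations before timing
        model(dummy_input)
    runs = 100                   # number of timed forward passes (assumption)
    start = time.perf_counter()
    for _ in range(runs):
        model(dummy_input)
    latency_ms = (time.perf_counter() - start) / runs * 1000

print(f"Average CPU latency: {latency_ms:.2f} ms")
```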
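
The hyperparameters quoted in the Experiment Setup row translate into a short PyTorch configuration. This is a minimal sketch assuming CIFAR-scale inputs; the placeholder network, epoch count, crop padding, and the omitted Cutout transform are assumptions not given in the excerpt.

```python
import torch
import torch.nn as nn
import torchvision.transforms as transforms

# Placeholder network standing in for the meta-graph / picked child
# architecture (assumption; the real model comes from the searched space).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)

# Stage 1: pre-training the meta-graph with SGD, initial learning rate 0.1.
pretrain_optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Stage 2: re-training the picked architecture from scratch with Adam,
# learning rate 0.01, decayed by cosine annealing.
epochs = 200  # assumption; the excerpt does not state the epoch count
retrain_optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(retrain_optimizer, T_max=epochs)

# Data augmentation: random cropping, random horizontal flipping, and Cutout
# (DeVries and Taylor 2017). Cutout is usually a custom transform that masks
# a random square patch of each image; it is omitted here for brevity.
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # padding of 4 is an assumption
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
```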