reproducibilityindex.ai

Generalizing Few-Shot NAS with Gradient Matching

Authors: Shoukang Hu, Ruochen Wang, Lanqing HONG, Zhenguo Li, Cho-Jui Hsieh, Jiashi Feng

ICLR 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive empirical evaluations of the proposed method on a wide range of search spaces (NASBench-201, DARTS, MobileNet Space), datasets (CIFAR-10, CIFAR-100, ImageNet) and search algorithms (DARTS, SNAS, RSPS, ProxylessNAS, OFA) demonstrate that it significantly outperforms its Few-Shot counterparts while surpassing previous comparable methods in terms of the accuracy of derived architectures.
Researcher Affiliation	Collaboration	(1) The Chinese University of Hong Kong; (2) University of California, Los Angeles; (3) Huawei Noah's Ark Lab; (4) National University of Singapore
Pseudocode	Yes	Appendix A: Pseudocode for GM-NAS
Open Source Code	Yes	Our code is available at https://github.com/skhu101/GM-NAS.
Open Datasets	Yes	We benchmark the proposed method on the full NASBench-201 Space (Dong & Yang, 2020) with five operations (none, skip, conv 1x1, conv 3x3, avgpool 3x3).
Dataset Splits	Yes	Table 8: Performance comparison among derived child networks using different supernet selection criteria in Few-Shot NAS and GM-NAS
Hardware Specification	No	The paper mentions 'GPU hours' for search cost but does not specify any particular GPU models, CPU models, or other hardware used for experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or library versions).
Experiment Setup	Yes	The derived architecture is trained from scratch with a batch size 96 for 600 epochs. We use SGD with an initial learning rate of 0.0025, a momentum of 0.9, and a weight decay of 3 × 10^-4, and a cosine learning rate scheduler. In addition, we also deploy the cutout regularization with length 16, drop-path with probability 0.3, and an auxiliary tower of weight 0.4.
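
The retraining hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object, which makes the learning rate implied by the cosine scheduler easy to inspect. The following is a minimal sketch, not the authors' code: the names `RetrainConfig` and `cosine_lr` are hypothetical, the weight decay is taken as 3 × 10^-4, and the schedule is assumed to anneal once per epoch down to a minimum learning rate of 0, since the quoted text says only "a cosine learning rate scheduler".

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrainConfig:
    """Hypothetical container for the values quoted in the Experiment Setup row."""
    batch_size: int = 96
    epochs: int = 600
    initial_lr: float = 0.0025
    momentum: float = 0.9
    weight_decay: float = 3e-4      # taken as 3 x 10^-4
    cutout_length: int = 16
    drop_path_prob: float = 0.3
    auxiliary_weight: float = 0.4


def cosine_lr(cfg: RetrainConfig, epoch: int, min_lr: float = 0.0) -> float:
    """Return the cosine-annealed learning rate at the given epoch (0-indexed).

    Assumption: annealing happens once per epoch and decays to `min_lr`;
    neither detail is stated in the quoted setup.
    """
    if not 0 <= epoch <= cfg.epochs:
        raise ValueError("epoch must lie in [0, cfg.epochs]")
    progress = epoch / cfg.epochs
    return min_lr + 0.5 * (cfg.initial_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    cfg = RetrainConfig()
    # Print the learning rate at a few points of the 600-epoch run.
    for epoch in (0, 150, 300, 450, 600):
        print(f"epoch {epoch:3d}: lr = {cosine_lr(cfg, epoch):.6f}")
```

Under these assumptions the learning rate starts at 0.0025, reaches half that value at epoch 300, and decays to zero at epoch 600. In a PyTorch setup the same schedule would typically come from `torch.optim.SGD` combined with `torch.optim.lr_scheduler.CosineAnnealingLR`, although the report does not record which framework versions were used.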