Stronger NAS with Weaker Predictors

Authors: Junru Wu, Xiyang Dai, Dongdong Chen, Yinpeng Chen, Mengchen Liu, Ye Yu, Zhangyang Wang, Zicheng Liu, Mei Chen, Lu Yuan

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that WeakNAS requires fewer samples to find top-performing architectures on NAS-Bench-101 and NAS-Bench-201. Compared to state-of-the-art (SOTA) predictor-based NAS methods, WeakNAS outperforms all of them with notable margins, e.g., requiring at least 7.5x fewer samples to find the global optimum on NAS-Bench-101. WeakNAS can also absorb their ideas to boost performance further. Furthermore, WeakNAS achieves a new SOTA result of 81.3% in the ImageNet MobileNet search space.
Researcher Affiliation | Collaboration | (1) Texas A&M University, (2) Microsoft Corporation, (3) University of Texas at Austin
Pseudocode | No | The paper describes the steps of its iterative process under "Implementation Outline" in Section 2.2, but only in paragraph form, not as a formal pseudocode or algorithm block (a hedged sketch of this loop is given after this table).
Open Source Code | Yes | The code is available at: https://github.com/VITA-Group/WeakNAS.
Open Datasets | Yes | We used publicly available data and cited the corresponding papers, including CIFAR-10 [33], CIFAR-100 [33], ImageNet16-120 [34], ImageNet [37], NAS-Bench-101 [32], NAS-Bench-201 [31], OFA [36], and the timm library [39].
Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See main paper Section 2.4 for the predictor setup, Section 3 for the detailed setup on each dataset, and the supplemental material for ImageNet training details.
Hardware Specification | Yes | Setup: For all experiments, we use an Intel Xeon E5-2650v4 CPU and a single Tesla P100 GPU, and use the multilayer perceptron (MLP) as our default NAS predictor, unless otherwise specified. ... we adopt PyTorch and the image models library (timm) [39] to implement our models and conduct all ImageNet experiments using 8 Tesla V100 GPUs.
Software Dependencies | No | The paper mentions software such as PyTorch, the image models library (timm) [39], and XGBoost [29], but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | For our weak predictor, we use a 4-layer MLP with hidden layer dimensions of (1000, 1000, 1000, 1000). ... we use the Gradient Boosting Regression Tree (GBRT) based on XGBoost [29], consisting of 1000 trees. ... we use a random forest consisting of 1000 Forests. ... In Table 1, we initialize the initial weak predictor f1 with 100 random samples and set M = 10, after progressively adding more weak predictors (from 1 to 191)... (illustrative predictor instantiations are sketched after this table).
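
Since the paper gives its iterative procedure only as prose (Section 2.2, "Implementation Outline"), the following is a minimal Python sketch of how such a progressive weak-predictor loop could look, assuming a NAS-Bench-style search space whose true accuracies can be queried directly. The names `encode` and `query_accuracy`, the use of scikit-learn's `MLPRegressor`, and the greedy top-M selection are illustrative assumptions standing in for the authors' exact sampling scheme; their implementation lives in the linked repository.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def weak_predictor_search(search_space, encode, query_accuracy,
                          n_init=100, m_per_iter=10, n_iters=190, seed=0):
    """Progressive weak-predictor search (illustrative sketch, not the official code).

    search_space   : list of candidate architectures (e.g. NAS-Bench-101 cells)
    encode         : maps an architecture to a fixed-length feature vector (assumed helper)
    query_accuracy : returns the benchmarked accuracy of an architecture (assumed helper)
    """
    rng = np.random.default_rng(seed)
    # Seed the evaluation history with random architectures (100 in the paper's Table 1 setup).
    evaluated = {int(i): query_accuracy(search_space[i])
                 for i in rng.choice(len(search_space), size=n_init, replace=False)}

    for _ in range(n_iters):
        # Fit a weak predictor on everything evaluated so far.
        X = np.stack([encode(search_space[i]) for i in evaluated])
        y = np.array(list(evaluated.values()))
        predictor = MLPRegressor(hidden_layer_sizes=(1000, 1000, 1000, 1000),
                                 max_iter=200).fit(X, y)

        # Rank the not-yet-evaluated architectures and query the top M of them,
        # shrinking the search toward the currently promising region.
        remaining = [i for i in range(len(search_space)) if i not in evaluated]
        scores = predictor.predict(np.stack([encode(search_space[i]) for i in remaining]))
        for j in np.argsort(scores)[::-1][:m_per_iter]:
            idx = remaining[j]
            evaluated[idx] = query_accuracy(search_space[idx])

    best = max(evaluated, key=evaluated.get)
    return search_space[best], evaluated[best]
```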
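
For the predictor configurations quoted in the Experiment Setup row, a minimal sketch using scikit-learn and XGBoost is shown below. The paper's own predictors are implemented in its repository, so these off-the-shelf instantiations are only assumptions that match the quoted sizes; in particular, the quoted "1000 Forests" is read here as 1000 trees, which is an assumption.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import RandomForestRegressor
from xgboost import XGBRegressor

# Default predictor: a 4-layer MLP with hidden layer dimensions (1000, 1000, 1000, 1000).
mlp_predictor = MLPRegressor(hidden_layer_sizes=(1000, 1000, 1000, 1000))

# Gradient Boosting Regression Tree (GBRT) predictor via XGBoost with 1000 trees.
gbrt_predictor = XGBRegressor(n_estimators=1000)

# Random forest predictor; the quoted "1000 Forests" is read as 1000 trees (assumption).
rf_predictor = RandomForestRegressor(n_estimators=1000)
```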