KNAS: Green Neural Architecture Search
Authors: Jingjing Xu, Liang Zhao, Junyang Lin, Rundong Gao, Xu Sun, Hongxia Yang
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that KNAS achieves competitive results while being orders of magnitude faster than train-then-test paradigms on image classification tasks. Furthermore, the extremely low search cost enables its wide application. The searched network also outperforms the strong baseline RoBERTa-large on two text classification tasks. |
| Researcher Affiliation | Collaboration | MOE Key Lab of Computational Linguistics, School of EECS, Peking University; Center for Data Science, Peking University; Alibaba Group. Correspondence to: Jingjing Xu <jingjingxu@pku.edu.cn>, Xu Sun <xusun@pku.edu.cn>. |
| Pseudocode | Yes | Algorithm 1: KNAS Algorithm (a hedged code sketch of this procedure appears after the table). |
| Open Source Code | Yes | Codes are available at https://github.com/Jingjing-NLP/KNAS. |
| Open Datasets | Yes | NAS-Bench-201 (Dong & Yang, 2020) is a benchmark dataset for NAS algorithms, constructed on image classification tasks, including CIFAR-10, CIFAR-100, and ImageNet16-120 (ImageNet-16). CIFAR-10 and CIFAR-100 are two widely used datasets. CIFAR-10 consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class; there are 50,000 training images and 10,000 test images. CIFAR-100 has 100 classes containing 600 images each, with 500 training images and 100 test images per class. ImageNet-16 is provided by Chrabaszcz et al. (2017). (A loader sketch confirming the CIFAR-10 counts appears after the table.) |
| Dataset Splits | No | The paper mentions a validation set in Algorithm 1 and states that NAS-Bench-201 architectures contain 'validation accuracy', but does not specify the explicit split proportions or sample counts for the validation sets for the datasets used (CIFAR10/100/ImageNet-16). |
| Hardware Specification | Yes | All baselines are implemented on a single NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions using Hugging Face for RoBERTa, but no specific software versions (e.g., Python version, library versions) are provided. |
| Experiment Setup | Yes | For all approaches, we set the time of architecture training plus evaluation to 2,160 seconds, 4,600 seconds, and 10,000 seconds on CIFAR-10, CIFAR-100, and ImageNet-16, respectively. ... For MRPC, the batch size is set to 4, and the learning rate is set to 3e-5. For RTE, the batch size is set to 4, and the learning rate is set to 2e-5. For the remaining hyper-parameters, we use the default settings. (A fine-tuning sketch with these hyper-parameters appears after the table.) |
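The Pseudocode row cites Algorithm 1 by name only. Below is a minimal PyTorch sketch of the selection procedure KNAS describes: score randomly initialized candidates by the mean of the Gram matrix of their per-sample gradients (the paper's gradient-kernel proxy), keep the top-k, and pick the winner by short training plus validation. The function names (`mgm_score`, `knas_select`) and the `train_and_eval` callback are our assumptions, not the authors' released code.

```python
import torch

def mgm_score(model, loss_fn, xs, ys):
    """Mean of the Gram matrix of per-sample gradients at random
    initialization (the gradient-kernel proxy KNAS ranks by)."""
    model.train()
    grads = []
    for x, y in zip(xs, ys):                       # small probe batch
        model.zero_grad()
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        loss.backward()
        g = torch.cat([p.grad.flatten()
                       for p in model.parameters() if p.grad is not None])
        grads.append(g.detach())
    G = torch.stack(grads)                         # (n, d) per-sample gradients
    return (G @ G.t()).mean().item()               # higher -> likely more trainable

def knas_select(candidates, loss_fn, xs, ys, k, train_and_eval):
    """Shape of Algorithm 1: rank all candidates by the proxy, then train
    only the top-k briefly and keep the best by validation accuracy."""
    ranked = sorted(candidates,
                    key=lambda m: mgm_score(m, loss_fn, xs, ys),
                    reverse=True)
    return max(ranked[:k], key=train_and_eval)     # train_and_eval -> val. accuracy
```

Because only the top-k candidates are ever trained, the proxy does the bulk of the pruning at initialization, which is where the "green" search-cost savings come from.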
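As a quick sanity check of the split sizes quoted in the Open Datasets row, the standard torchvision loaders reproduce the 50,000/10,000 CIFAR-10 train/test counts. The paper itself consumes these datasets through NAS-Bench-201, so this loader is illustrative only.

```python
import torchvision
from torchvision import transforms

tfm = transforms.ToTensor()
train = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=tfm)
test = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=tfm)
assert len(train) == 50_000 and len(test) == 10_000  # counts quoted in the table
```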
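The Experiment Setup row gives only the batch size and learning rate for the GLUE fine-tuning runs. A minimal Hugging Face sketch of the MRPC configuration follows; batch size 4 and learning rate 3e-5 come from the row above, while the baseline checkpoint (`roberta-large`), the epoch count, and the preprocessing are our assumptions (the paper fine-tunes its searched network, with RoBERTa-large as the baseline).

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large", num_labels=2)

mrpc = load_dataset("glue", "mrpc")
enc = mrpc.map(lambda b: tok(b["sentence1"], b["sentence2"], truncation=True),
               batched=True)

args = TrainingArguments(
    output_dir="mrpc-out",
    per_device_train_batch_size=4,   # batch size reported for MRPC
    learning_rate=3e-5,              # MRPC learning rate from the paper
    num_train_epochs=3,              # assumed; not stated in this section
)
Trainer(model=model, args=args, tokenizer=tok,
        train_dataset=enc["train"], eval_dataset=enc["validation"]).train()
```

For RTE, the same sketch applies with `load_dataset("glue", "rte")` and a learning rate of 2e-5, per the same row.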