NASPY: Automated Extraction of Automated Machine Learning Models
Authors: Xiaoxuan Lou, Shangwei Guo, Jiwei Li, Yaoxin Wu, Tianwei Zhang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments to demonstrate the effectiveness of NASPY. Our identification model can predict the operation sequences of different NAS methods (DARTS (Liu et al., 2018), GDAS (Dong & Yang, 2019) and TE-NAS (Chen et al., 2021)) with an error rate of 3.2%. Our hyper-parameter prediction can achieve more than 98% accuracy. The framework also demonstrates high robustness against random noise introduced by the complex and dynamic hardware systems. |
| Researcher Affiliation | Collaboration | 1Nanyang Technological University, 2Chongqing University, 3Zhejiang University, 4Shannon.AI |
| Pseudocode | Yes | Algorithm 1: GEMM in OpenBLAS |
| Open Source Code | Yes | The source code of NASPY is available at https://github.com/LouXiaoxuan/NASPY. |
| Open Datasets | Yes | Dataset construction. We search model architectures with CIFAR10, and train model parameters over CIFAR10 and CIFAR100. |
| Dataset Splits | Yes | We randomly select 80% of the sequences as the training set, and the rest as the validation set. |
| Hardware Specification | Yes | The model is trained for 100 epochs, which takes 6.25 hours on one V100 GPU. |
| Software Dependencies | Yes | Without loss of generality, we adopt PyTorch (1.8.0) and OpenBLAS (0.3.13). |
| Experiment Setup | Yes | CRNN+CTC model. This model consists sequentially of one convolution layer l_C, one bidirectional GRU layer l_R, and one classifier F with two FC layers. To evaluate the feature-learning capability of l_C, both 1D and 2D convolutions are adopted in the experiments for comparison. Besides, to evaluate the performance of identifiers with different model sizes, three candidate hidden dimensions of l_R (i.e., 128, 256, 512) are considered. To train the model, we use CTC loss as the criterion to bypass sequence alignment, and we use the Adam optimizer. The learning rate starts from 5e-4 and is scheduled following the One Cycle LR policy (Smith & Topin, 2019). The model is trained for 100 epochs, which takes 6.25 hours on one V100 GPU. (A hedged code sketch of this setup follows the table.) |
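
To make the Experiment Setup row concrete, here is a minimal, hedged PyTorch sketch of the CRNN+CTC identifier as described in the excerpt: one 1D convolution (l_C), one bidirectional GRU (l_R, with hidden size 256 as one of the three candidates), a two-FC-layer classifier (F), trained with CTC loss, Adam, and a One Cycle LR schedule starting from 5e-4 for 100 epochs. The input feature dimension, number of operation classes, kernel size, sequence lengths, steps per epoch, and the exact OneCycleLR configuration are illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch of the CRNN+CTC operation-sequence identifier described above.
# Assumptions (not in the excerpt): in_feats, num_ops, kernel size, dummy batch
# shapes, and steps_per_epoch are placeholders for illustration only.
import torch
import torch.nn as nn


class CRNNIdentifier(nn.Module):
    def __init__(self, in_feats=64, hidden=256, num_ops=10):
        super().__init__()
        # l_C: a single 1D convolution over the trace features
        self.conv = nn.Sequential(
            nn.Conv1d(in_feats, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # l_R: one bidirectional GRU layer (candidate hidden sizes: 128/256/512)
        self.gru = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        # F: classifier with two FC layers; +1 output class for the CTC blank
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_ops + 1),
        )

    def forward(self, x):                 # x: (batch, in_feats, time)
        h = self.conv(x)                  # (batch, hidden, time)
        h = h.transpose(1, 2)             # (batch, time, hidden) for the GRU
        h, _ = self.gru(h)                # (batch, time, 2*hidden)
        logits = self.classifier(h)       # (batch, time, num_ops + 1)
        return logits.log_softmax(-1)     # log-probs expected by CTCLoss


model = CRNNIdentifier()
criterion = nn.CTCLoss(blank=0, zero_infinity=True)
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
# One Cycle LR policy over 100 epochs; steps_per_epoch is an assumed placeholder
steps_per_epoch = 100
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=5e-4, epochs=100, steps_per_epoch=steps_per_epoch)

# One illustrative training step with dummy shapes
x = torch.randn(8, 64, 200)                        # 8 traces, 64 features, 200 time steps
targets = torch.randint(1, 11, (8, 20))            # operation labels; index 0 is the CTC blank
input_lengths = torch.full((8,), 200, dtype=torch.long)
target_lengths = torch.full((8,), 20, dtype=torch.long)

log_probs = model(x).transpose(0, 1)               # CTCLoss expects (time, batch, classes)
loss = criterion(log_probs, targets, input_lengths, target_lengths)
loss.backward()
optimizer.step()
scheduler.step()
```

The CTC criterion lets the identifier emit one prediction per time step of the side-channel trace without requiring per-step operation labels, which matches the paper's stated reason for using CTC loss (bypassing sequence alignment).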