NASGEM: Neural Architecture Search via Graph Embedding Method
Authors: Hsin-Pai Cheng, Tunhou Zhang, Yixing Zhang, Shiyu Li, Feng Liang, Feng Yan, Meng Li, Vikas Chandra, Hai Li, Yiran Chen. Pages 7090-7098.
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply NASGEM to search efficient mobile neural architectures on image classification and object detection tasks. We initiate architecture search with a complete DAG of N = 30 so that a rich source of cells can be constructed. Each node can choose from either 1 × 1 convolution or 3 × 3 depthwise separable convolution as its candidate operation. Following the common practice in NAS, each convolution operation adopts a Convolution-BatchNorm-ReLU triplet (Xie et al. 2019a; Liu, Simonyan, and Yang 2019). We use 1/10 of the CIFAR-10 dataset as a proxy to evaluate performance, and use the obtained performance to train the predictor. Table 1 summarizes the key performance metrics on the ImageNet dataset. Table 2: Results on MS COCO dataset. |
| Researcher Affiliation | Collaboration | 1 Duke University 2 Facebook, Inc 3 Tsinghua University 4 University of Nevada, Reno |
| Pseudocode | No | The paper states that "Predictor training algorithm and code implementation can be found in Supplementary Material" and "The procedure and code of measuring the similarity between two input graphs is provided in the supplementary material," implying that pseudocode or algorithms exist but are not presented in the main paper. |
| Open Source Code | Yes | Predictor training algorithm and code implementation can be found in Supplementary Material. The procedure and code of measuring the similarity between two input graphs is provided in the supplementary material. |
| Open Datasets | Yes | We use 1/10 of the CIFAR-10 dataset as a proxy to evaluate performance, and use the obtained performance to train the predictor. We conduct object detection experiments on the challenging MS COCO dataset (Lin et al. 2014). We use the whole COCO trainval135 as training set and validate on COCO minival. To justify the effectiveness of NASGEM, we further perform evaluation on NASBench-101 (Ying et al. 2019). |
| Dataset Splits | Yes | ACC is the validation accuracy on proxy dataset. We use the whole COCO trainval135 as training set and validate on COCO minival. |
| Hardware Specification | No | The paper mentions "0.4 GPU days" for search cost but does not specify any particular GPU models, CPU types, or other hardware details used for running experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | We initiate architecture search with a complete DAG of N = 30 so that a rich source of cells can be constructed. Each node can choose from either 1 × 1 convolution or 3 × 3 depthwise separable convolution as its candidate operation. Each convolution operation adopts a Convolution-BatchNorm-ReLU triplet. We use 1/10 of the CIFAR-10 dataset as a proxy to evaluate performance. The predictor is a fully connected neural network with activation function ReLU. The encoder and decoder are trained for 10k iterations using 50k generated graph pairs. |
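The search-space setup quoted above (a complete DAG of N = 30 nodes, each node choosing one of two candidate operations, cells drawn as subgraphs) can be sketched as follows. This is a minimal illustration assuming a standard complete-DAG encoding and uniform edge/op sampling; the helper names (`complete_dag_edges`, `sample_cell`) and the 0.5 edge-keep probability are our own assumptions, not details from the paper.

```python
import random

N = 30  # nodes in the complete DAG, per the paper's setup
# the two candidate operations each node can choose from
CANDIDATE_OPS = ("conv1x1", "dwconv3x3")

def complete_dag_edges(n):
    """Edges of a complete DAG: every node i feeds every later node j."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]

def sample_cell(n, rng, keep_prob=0.5):
    """Sample one cell from the complete DAG (illustrative, not the
    paper's exact procedure): keep each edge with probability
    `keep_prob` and assign each node one candidate operation."""
    edges = [e for e in complete_dag_edges(n) if rng.random() < keep_prob]
    ops = [rng.choice(CANDIDATE_OPS) for _ in range(n)]
    return edges, ops

rng = random.Random(0)
edges, ops = sample_cell(N, rng)
# the complete DAG over 30 nodes has 30*29/2 = 435 possible edges
print(len(complete_dag_edges(N)))  # → 435
```

A cell sampled this way is what the searched predictor would score; the 1/10 CIFAR-10 proxy accuracies mentioned in the table are what such a predictor is trained against.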