Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS
Authors: Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James Kwok, Tong Zhang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted to verify the effectiveness of our method over competing algorithms. |
| Researcher Affiliation | Collaboration | Han Shi1, Renjie Pi2, Hang Xu3, Zhenguo Li3, James T. Kwok1, Tong Zhang1; 1Hong Kong University of Science and Technology, Hong Kong {hshiac,jamesk}@cse.ust.hk, tongzhang@ust.hk; 2The University of Hong Kong, Hong Kong pipilu@hku.hk; 3Huawei Noah's Ark Lab {xu.hang,li.zhenguo}@huawei.com |
| Pseudocode | Yes | Algorithm 1 (Generic BO procedure for NAS) and Algorithm 2 (BONAS). |
| Open Source Code | Yes | The code is available at https://github.com/pipilurj/BONAS. |
| Open Datasets | Yes | In the following experiments, we use NAS-Bench-101 [38], which is the largest NAS benchmark data set (with 423K convolutional architectures), and the more recent NAS-Bench-201 [8], which uses a different search space (with 15K architectures) and is applicable to almost any NAS algorithm; a new NAS benchmark data set LSTM-12K we recently collected for LSTMs; the Penn Tree Bank data set [21] and CIFAR-10 data set; and ImageNet [6]. |
| Dataset Splits | Yes (see the split sketch after the table) | For each data set, we use 85% of the data for training, 10% for validation, and the rest for testing. |
| Hardware Specification | Yes | All experiments are performed on NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions software like 'Adam optimizer' but does not provide specific version numbers for any key software components or libraries. |
| Experiment Setup | Yes (see the surrogate-training sketch after the table) | The GCN has four hidden layers with 64 units each. Training is performed by minimizing the square loss, using the Adam optimizer [12] with a learning rate of 0.001 and a mini-batch size of 128. In step 10 of Algorithm 2, k = 100 models are merged into a super-network and trained for 100 epochs using the procedure discussed in Section 3.2. |
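The Dataset Splits row quotes an 85% training / 10% validation / 5% test split. The snippet below is a minimal, hypothetical sketch of such a split; the function name, random seed, and use of NumPy are illustrative assumptions and are not taken from the BONAS code.

```python
# Hypothetical sketch of the 85%/10%/5% split quoted in the Dataset Splits row.
# Names and the seeding scheme are illustrative, not from the BONAS repository.
import numpy as np

def split_dataset(num_samples: int, seed: int = 0):
    """Return index arrays for an 85/10/5 train/validation/test split."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(num_samples)
    n_train = int(0.85 * num_samples)
    n_val = int(0.10 * num_samples)
    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]   # the remaining ~5%
    return train_idx, val_idx, test_idx

# Example: split the 15K architectures of NAS-Bench-201.
train_idx, val_idx, test_idx = split_dataset(15_000)
```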
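The Experiment Setup row quotes the GCN surrogate configuration: four hidden layers with 64 units each, square loss, Adam with learning rate 0.001, and mini-batch size 128. The PyTorch sketch below only mirrors those hyperparameters; the layer structure, input encoding, and helper names are simplifying assumptions rather than the actual BONAS implementation (which is available at the repository linked above).

```python
# Minimal PyTorch sketch of a GCN surrogate with the quoted hyperparameters
# (four hidden layers of 64 units, square loss, Adam with lr 1e-3, batch size 128).
# The graph encoding and readout are assumptions, not the exact BONAS model.
import torch
import torch.nn as nn

class SimpleGCNSurrogate(nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int = 64, num_layers: int = 4):
        super().__init__()
        dims = [in_dim] + [hidden_dim] * num_layers
        self.layers = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(num_layers)
        )
        self.readout = nn.Linear(hidden_dim, 1)  # predicted accuracy

    def forward(self, adj: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # adj:   (batch, nodes, nodes) normalized adjacency of the architecture cell
        # feats: (batch, nodes, in_dim) one-hot operation features
        h = feats
        for layer in self.layers:
            h = torch.relu(adj @ layer(h))        # graph convolution: A * (H W)
        return self.readout(h.mean(dim=1)).squeeze(-1)  # mean-pool nodes, predict a scalar

# in_dim=5 is a placeholder for the number of candidate operations.
model = SimpleGCNSurrogate(in_dim=5)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()                            # the "square loss" quoted above

def train_step(adj_batch, feat_batch, acc_batch):
    """One mini-batch update (mini-batch size 128 in the quoted setup)."""
    optimizer.zero_grad()
    pred = model(adj_batch, feat_batch)
    loss = loss_fn(pred, acc_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```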