Neural Architecture Generator Optimization
Authors: Binxin Ru, Pedro Esperança, Fabio Maria Carlucci
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of this strategy on six benchmark datasets and show that our search space generates extremely lightweight yet highly competitive models. The code is available at https://github.com/rubinxin/vega_NAGO. (Section 4, Experiments:) We perform experiments on a variety of image datasets: CIFAR10, CIFAR100 [45], IMAGENET [46] for object recognition; SPORT8 for action recognition [47]; MIT67 for scene recognition [48]; FLOWERS102 for fine-grained object recognition [49]. HNAG clearly outperforms RNAG (Table 4). Moreover, in the multi-objective case, HNAG-MOBO is able to find models which are not only competitive in test accuracy but also very lightweight (i.e., consuming only 1/3 of the memory when compared to RNAG-D). |
| Researcher Affiliation | Collaboration | Binxin Ru, Machine Learning Research Group, University of Oxford, UK (robin@robots.ox.ac.uk); Pedro M. Esperança, Huawei Noah's Ark Lab, London, UK (pedro.esperanca@huawei.com); Fabio M. Carlucci, Huawei Noah's Ark Lab, London, UK (fabio.maria.carlucci@huawei.com) |
| Pseudocode | Yes | The general algorithm for applying BO to our search space is presented in Appendix B. (In Appendix B): Algorithm 1: General BO-based NAS Algorithm |
| Open Source Code | Yes | The code is available at https://github.com/rubinxin/vega_NAGO. |
| Open Datasets | Yes | Datasets. We perform experiments on a variety of image datasets: CIFAR10, CIFAR100 [45], IMAGENET [46] for object recognition; SPORT8 for action recognition [47]; MIT67 for scene recognition [48]; FLOWERS102 for fine-grained object recognition [49]. |
| Dataset Splits | No | The paper refers to 'validation accuracy' (e.g., in Figure 3 and Section 4.2) and describes training protocols, but it does not explicitly state the train/validation/test split percentages or sample counts used for datasets such as CIFAR10/100 or ImageNet in the main text. |
| Hardware Specification | Yes | All experiments use NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as PyTorch, TensorFlow, or specific libraries used in the experimental setup. |
| Experiment Setup | Yes | For all datasets except IMAGENET, we evaluate the performance of the (Pareto) optimal generators recommended by BO by sampling 8 networks from the generator and training them to completion (600 epochs) with initial learning rate 0.025 and batch size 96. For IMAGENET, we follow the complete training protocol of small model regime in [14], which trains the networks for 250 epochs with an initial learning rate of 0.1 and a batch size of 256. We use cutout with length 16 for small-image tasks and size 112 for large-image tasks. BOHB is used to find the optimal network generator hyperparameters in terms of the validation accuracy. We perform BOHB for 60 iterations. We use training budgets of 100, 200, 400 epochs to evaluate architectures on SPORT8 and 30, 60, 120 epochs on the other datasets. |
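The Pseudocode row points to Appendix B's "Algorithm 1: General BO-based NAS Algorithm". The paper's actual search space and surrogate are not reproduced here; the sketch below only illustrates the generic BO loop shape on a toy continuous objective, with a minimal NumPy Gaussian-process surrogate and a UCB acquisition. `evaluate_generator` is a hypothetical stand-in for "sample networks from the generator and return mean validation accuracy"; all function names and hyperparameters are our assumptions, not the authors'.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, ls=0.3):
    # Squared-exponential kernel between two sets of points.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Standard zero-mean GP regression: posterior mean and std at Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs)
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - (v * v).sum(0), 1e-12, None)
    return mean, np.sqrt(var)

def evaluate_generator(theta):
    # Hypothetical stand-in for sampling architectures from the
    # generator parameterised by theta and returning mean validation
    # accuracy (here: a toy quadratic peaked at theta = 0.5).
    return float(-((theta - 0.5) ** 2).sum())

def bo_nas(n_init=5, n_iter=10, dim=3, beta=1.96):
    # Initial design: random generator hyperparameters.
    X = rng.uniform(0, 1, size=(n_init, dim))
    y = np.array([evaluate_generator(x) for x in X])
    for _ in range(n_iter):
        # Fit the surrogate on centred observations, then pick the
        # candidate maximising the UCB acquisition mean + beta * std.
        cand = rng.uniform(0, 1, size=(256, dim))
        mean, std = gp_posterior(X, y - y.mean(), cand)
        x_next = cand[np.argmax(mean + beta * std)]
        X = np.vstack([X, x_next])
        y = np.append(y, evaluate_generator(x_next))
    return X[np.argmax(y)], float(y.max())

best_theta, best_val = bo_nas()
```

In the paper's setting, each `evaluate_generator` call would itself involve training sampled networks under a budget, which is why the authors resort to multi-fidelity methods such as BOHB.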
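The Experiment Setup row quotes several per-dataset protocols. For reference, they can be collected into a single configuration table; the numeric values below come directly from the quoted text, while the dictionary layout and the helper `protocol` are our own illustrative structure.

```python
# Final-training protocols quoted from the paper's Section 4 setup.
FINAL_TRAINING = {
    "default":  {"epochs": 600, "initial_lr": 0.025, "batch_size": 96},
    "imagenet": {"epochs": 250, "initial_lr": 0.1,   "batch_size": 256},
}

# Cutout augmentation lengths: 16 for small-image tasks, 112 for large.
CUTOUT_LENGTH = {"small_image": 16, "large_image": 112}

# BOHB multi-fidelity training budgets (epochs), 60 BOHB iterations.
BOHB_ITERATIONS = 60
BOHB_BUDGETS = {"sport8": (100, 200, 400), "other": (30, 60, 120)}

def protocol(dataset):
    # IMAGENET follows the small-model regime of [14]; every other
    # dataset uses the 600-epoch protocol.
    key = "imagenet" if dataset.lower() == "imagenet" else "default"
    return FINAL_TRAINING[key]
```

Note that the 600-epoch protocol also samples 8 networks per recommended generator and averages their performance, per the quoted setup.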