MOTE-NAS: Multi-Objective Training-based Estimate for Efficient Neural Architecture Search
Authors: Yu-Ming Zhang, Jun-Wei Hsieh, Xin Li, Ming-Ching Chang, Chun-Chieh Lee, Kuo-Chin Fan
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on NASBench-201 show MOTE-NAS achieves 94.32% accuracy on CIFAR-10, 72.81% on CIFAR-100, and 46.38% on ImageNet-16-120, outperforming NTK-based NAS approaches. An evaluation-free (EF) version of MOTE-NAS runs in only 5 minutes and still delivers a more accurate model than KNAS. |
| Researcher Affiliation | Academia | Yu-Ming Zhang (1), Jun-Wei Hsieh (2), Xin Li (3), Ming-Ching Chang (3), Chun-Chieh Lee (1), Kuo-Chin Fan (1); (1) National Central University, (2) National Yang Ming Chiao Tung University, (3) University at Albany |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | We have described the experimental details in the paper and supplementary materials as much as possible, and we will make the code public on GitHub. |
| Open Datasets | Yes | We used NASBench-101 and NASBench-201, both cell-based search spaces. NASBench-101 contains 423,621 candidates trained on CIFAR-10 for 108 epochs. NASBench-201 includes 15,625 candidates trained on CIFAR-10, CIFAR-100, and ImageNet-16-120 for 200 epochs each. We also search for a promising architecture in the MobileNetV3 search space using MOTE, then train and evaluate it on ImageNet-1K. (A hedged benchmark-query sketch follows the table.) |
| Dataset Splits | No | The paper mentions 'training data' and 'test accuracy' (including for early stopping), but does not explicitly provide percentages, counts, or a detailed methodology for train/validation/test dataset splits used in their experiments. |
| Hardware Specification | Yes | Computation was performed on Tesla V100 GPUs, with MOTE and MOTE-NAS costs measured specifically on the V100. ImageNet-1K training was run for 200 epochs on 10 GTX 2080Ti GPUs. |
| Software Dependencies | No | The paper mentions the use of the 'Adam optimizer' and a 'Box-Cox transformation', but does not provide specific version numbers for the software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | The hyperparameters are batch size 256, 50 epochs, learning rate 0.001 with the Adam optimizer, and a cross-entropy loss. The best architecture is then selected using the early-stopping version of the test accuracy. (A minimal training-setup sketch follows the table.) |
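As a rough illustration of how the NASBench-201 records cited in the Open Datasets row can be queried, the sketch below assumes the publicly released `nas_201_api` package and a locally downloaded benchmark file. The file name, dataset keys, and the `get_more_info` signature are assumptions based on one published release of that API and may differ in other versions; this is not the authors' code.

```python
# Minimal sketch, assuming the nas_201_api package (the NAS-Bench-201 query API)
# and a local copy of the benchmark file; names and keys may vary across releases.
from nas_201_api import NASBench201API

api = NASBench201API("NAS-Bench-201-v1_1-096897.pth")  # assumed local benchmark file
print(len(api))  # 15,625 candidate cells in the NAS-Bench-201 search space

# Look up the recorded 200-epoch results of one candidate on each dataset.
for dataset in ("cifar10", "cifar100", "ImageNet16-120"):
    info = api.get_more_info(123, dataset, hp="200")  # dict of recorded metrics
    print(dataset, info.get("test-accuracy"))         # key name assumed from the API docs
```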
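To make the reported experiment setup concrete, the following is a minimal PyTorch sketch of a training loop using the stated hyperparameters (batch size 256, 50 epochs, Adam at learning rate 0.001, cross-entropy loss, selection by the best test accuracy seen during training). The `model`, `train_set`, and `test_set` objects are placeholders; this is an illustration of the stated configuration, not the paper's implementation.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

# Hyperparameters as reported in the paper's experiment setup.
BATCH_SIZE = 256
EPOCHS = 50
LR = 1e-3

def train(model, train_set, test_set, device="cuda"):
    """Train with Adam + cross-entropy; return the best (early-stopping) test accuracy."""
    train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)
    test_loader = DataLoader(test_set, batch_size=BATCH_SIZE)
    optimizer = torch.optim.Adam(model.parameters(), lr=LR)
    criterion = nn.CrossEntropyLoss()
    model.to(device)
    best_acc = 0.0

    for epoch in range(EPOCHS):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()

        # Early-stopping-style selection: keep the best test accuracy seen so far.
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for x, y in test_loader:
                x, y = x.to(device), y.to(device)
                correct += (model(x).argmax(dim=1) == y).sum().item()
                total += y.numel()
        best_acc = max(best_acc, correct / total)

    return best_acc
```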