EG-NAS: Neural Architecture Search with Fast Evolutionary Exploration

Authors: Zicheng Cai, Lei Chen, Peng Liu, Tongtao Ling, Yutao Lai

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on various datasets and search spaces demonstrate that EG-NAS achieves highly competitive performance at significantly low search costs compared to state-of-the-art methods. The discovered architectures not only achieve 97.47% accuracy on CIFAR-10, but also reach 74.4% top-1 accuracy when transferred to ImageNet. Moreover, we directly performed the search and evaluation on ImageNet, achieving an outstanding 75.1% top-1 accuracy with a search cost of just 1.2 GPU-days on 2 RTX 4090 GPUs, the fastest search speed among state-of-the-art methods. In this part, we conduct extensive experiments to evaluate our approach, EG-NAS, on the DARTS search space with CIFAR-10, CIFAR-100, and ImageNet for image classification, as well as the NAS-Bench-201 search space with CIFAR-10, CIFAR-100, and ImageNet-16-120.
Researcher Affiliation | Collaboration | Zicheng Cai (1,2), Lei Chen (1), Peng Liu (2,3), Tongtao Ling (1), Yutao Lai (1); affiliations: (1) Guangdong University of Technology, (2) Ping An Technology (Shenzhen) Co., Ltd., (3) The Hong Kong Polytechnic University; contact: chenlei3@gdut.edu.cn
Pseudocode | Yes | Algorithm 1: Main framework of EG-NAS (a hedged code sketch of this loop is given after the table).
Open Source Code | Yes | The code is available at https://github.com/caicaicheng/EG-NAS.
Open Datasets | Yes | Extensive experiments on various datasets and search spaces demonstrate that EG-NAS achieves highly competitive performance at significantly low search costs compared to state-of-the-art methods. In this part, we conduct extensive experiments to evaluate our approach, EG-NAS, on the DARTS search space with CIFAR-10, CIFAR-100, and ImageNet for image classification, as well as the NAS-Bench-201 search space with CIFAR-10, CIFAR-100, and ImageNet-16-120.
Dataset Splits | Yes | The training set of CIFAR-10 is divided into two parts of equal size, one for optimizing the network parameters by gradient descent and the other for obtaining the search direction. Acc_val means the validation accuracy, x_n is the search direction that yields the best fitness value by sampling the evolutionary strategy, and ω_{t-1} is the network parameters at the (t-1) step. Select the best search direction s_{t+1} based on the validation accuracy from D = {x_1, x_2, ..., x_N}. (A sketch of this split is given after the table.)
Hardware Specification | Yes | All experiments were conducted on a single Nvidia RTX 3090, except for the ImageNet experiments, which were conducted on 2 RTX 4090 GPUs.
Software Dependencies | No | The paper mentions the use of optimizers (SGD) but does not provide specific version numbers for any software dependencies such as Python, PyTorch, TensorFlow, or other libraries.
Experiment Setup | Yes | We train the supernet for 50 epochs (the first 15 epochs for warm-up) with a batch size of 256 and retrain the searched network from scratch for 600 epochs with a batch size of 96. In ES, the population size λ is set to 25, and the coefficients ζ and η for L1 and L2 are set to 1.0 and 0.4, respectively. The step size ξ, the number of samples N, and the initial channel number are set to 0.6, 5, and 16, respectively. We employ an SGD optimizer with a linearly decayed learning rate initialized at 0.5, a momentum of 0.9, and a weight decay of 3 × 10^-5. (A configuration sketch is given after the table.)
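
The Dataset Splits row states only that the CIFAR-10 training set is divided into two equal parts, one for the weight updates and one for scoring search directions. The following is a minimal sketch of that split in PyTorch; the ToTensor transform, the SubsetRandomSampler usage, and the loader names are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of the 50/50 CIFAR-10 split described in the Dataset
# Splits row: one half trains the supernet weights by gradient descent,
# the other half scores candidate search directions.
# ASSUMPTIONS: plain ToTensor() transform and SubsetRandomSampler; the
# paper only states that the training set is divided into two equal parts.
import torch
import torchvision
import torchvision.transforms as T

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=T.ToTensor())

indices = torch.randperm(len(train_set)).tolist()   # 50,000 images
split = len(train_set) // 2                         # two equal halves

weight_loader = torch.utils.data.DataLoader(        # optimizes weights omega
    train_set, batch_size=256,
    sampler=torch.utils.data.SubsetRandomSampler(indices[:split]))

direction_loader = torch.utils.data.DataLoader(     # scores search directions
    train_set, batch_size=256,
    sampler=torch.utils.data.SubsetRandomSampler(indices[split:]))
```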
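Algorithm 1 itself is not reproduced in this report, so the block below is a hedged sketch of how the quoted pieces could fit together: SGD updates the supernet weights on one half of the data, while an evolution strategy samples N candidate search directions for the architecture parameters alpha and keeps the one with the best validation accuracy. The Gaussian sampling, the `supernet(images, alpha)` calling convention, and all function names are assumptions, not the authors' implementation (see the linked repository for the real code).

```python
# Hedged sketch of the EG-NAS outer loop ("Algorithm 1: Main framework of
# EG-NAS"), pieced together from the table rows above.
# ASSUMPTIONS: Gaussian ES sampling, a supernet forward pass of the form
# supernet(images, alpha), and the hyperparameter wiring.
import torch


def sample_directions(alpha, n_samples=5, step_size=0.6):
    # N = 5 sampled directions and step size xi = 0.6 come from the
    # Experiment Setup row; the Gaussian form is an assumption.
    return [step_size * torch.randn_like(alpha) for _ in range(n_samples)]


@torch.no_grad()
def validation_accuracy(supernet, alpha, val_loader, device="cuda"):
    # Fitness of a candidate alpha: top-1 accuracy on the held-out half.
    supernet.eval()
    correct = total = 0
    for images, labels in val_loader:
        images, labels = images.to(device), labels.to(device)
        preds = supernet(images, alpha).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / max(total, 1)


def eg_nas_search(supernet, alpha, weight_loader, direction_loader,
                  epochs=50, warmup=15, device="cuda"):
    criterion = torch.nn.CrossEntropyLoss()
    w_opt = torch.optim.SGD(supernet.parameters(), lr=0.5,
                            momentum=0.9, weight_decay=3e-5)
    for epoch in range(epochs):
        # (1) Update supernet weights omega by gradient descent.
        supernet.train()
        for images, labels in weight_loader:
            images, labels = images.to(device), labels.to(device)
            w_opt.zero_grad()
            criterion(supernet(images, alpha), labels).backward()
            w_opt.step()
        # (2) After warm-up, pick the best search direction s_{t+1}
        #     from D = {x_1, ..., x_N} by validation accuracy.
        if epoch >= warmup:
            candidates = sample_directions(alpha)
            best = max(candidates,
                       key=lambda d: validation_accuracy(
                           supernet, alpha + d, direction_loader, device))
            alpha = alpha + best
    return alpha
```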
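Finally, the Experiment Setup row translates into the hyperparameter summary below. The numeric values are quoted from the paper; the LambdaLR schedule is an assumption standing in for "linearly decayed learning rate", and the constant names are introduced here for illustration only.

```python
# Hyperparameters from the Experiment Setup row, plus a plausible SGD
# optimizer/scheduler setup. Values come from the quoted text; the LambdaLR
# wiring is an ASSUMPTION about how the linear decay is realized.
import torch

SEARCH_EPOCHS  = 50    # supernet search (first 15 epochs are warm-up)
WARMUP_EPOCHS  = 15
SEARCH_BATCH   = 256
RETRAIN_EPOCHS = 600   # final architecture retrained from scratch
RETRAIN_BATCH  = 96
POPULATION     = 25    # ES population size lambda
ZETA, ETA      = 1.0, 0.4   # coefficients for the L1 and L2 terms
STEP_SIZE      = 0.6   # ES step size xi
N_SAMPLES      = 5     # sampled search directions per step
INIT_CHANNELS  = 16


def make_weight_optimizer(model):
    """SGD with LR 0.5, momentum 0.9, weight decay 3e-5, decayed linearly."""
    optimizer = torch.optim.SGD(model.parameters(), lr=0.5,
                                momentum=0.9, weight_decay=3e-5)
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lr_lambda=lambda e: 1.0 - e / SEARCH_EPOCHS)
    return optimizer, scheduler
```

A call such as `opt, sched = make_weight_optimizer(supernet)` followed by `sched.step()` once per epoch realizes the decay from 0.5 toward zero over the 50 search epochs.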