Neural Architecture Optimization

Authors: Renqian Luo, Fei Tian, Tao Qin, Enhong Chen, Tie-Yan Liu

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments show that the architecture discovered by our method is very competitive for image classification task on CIFAR-10 and language modeling task on PTB, outperforming or on par with the best results of previous architecture search methods with a significantly reduction of computational resources. |
| Researcher Affiliation | Collaboration | ¹University of Science and Technology of China, Hefei, China; ²Microsoft Research, Beijing, China |
| Pseudocode | Yes | Algorithm 1: Neural Architecture Optimization |
| Open Source Code | Yes | Our codes and model checkpoints are available at https://github.com/renqianluo/NAO. |
| Open Datasets | Yes | Experiments show that the architecture discovered by our method is very competitive for image classification task on CIFAR-10 and language modeling task on PTB |
| Dataset Splits | Yes | The performance predictor f : E → ℝ⁺ is another important module accompanied with the encoder. It maps the continuous representation e_x of an architecture x into its performance s_x measured by dev set accuracy. *(See the predictor sketch after the table.)* |
| Hardware Specification | Yes | We use 200 V100 GPU cards to complete all the process within 1 day. ... We use 200 P100 GPU cards to complete all the process within 1.5 days. |
| Software Dependencies | No | The paper does not explicitly mention specific software dependencies with version numbers. |
| Experiment Setup | Yes | The architecture encoder of NAO is an LSTM model with token embedding size and hidden state size respectively set as 32 and 96. The encoder, performance predictor and decoder of NAO are trained using Adam for 1000 epochs with a learning rate of 0.001. The trade-off parameters in Eqn. (1) is λ = 0.9. The step size to perform continuous optimization is η = 10. *(See the training/optimization sketch after the table.)* |
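The Dataset Splits row quotes the paper's definition of the performance predictor f : E → ℝ⁺, which maps the continuous representation e_x of an architecture x to its dev-set accuracy s_x. Below is a minimal PyTorch sketch of such a predictor; the mean-pooling over encoder states, the hidden width, and the class and argument names are illustrative assumptions, not details quoted in the table.

```python
import torch
import torch.nn as nn

class PerformancePredictor(nn.Module):
    """Maps a continuous architecture representation e_x to a
    predicted dev-set accuracy s_x in (0, 1)."""

    def __init__(self, encoder_dim=96, hidden_dim=96):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(encoder_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),  # accuracy lies in (0, 1)
        )

    def forward(self, e_x):
        # e_x: (batch, seq_len, encoder_dim) hidden states from the encoder.
        # Mean-pool over the sequence, then regress a scalar score s_x.
        pooled = e_x.mean(dim=1)
        return self.mlp(pooled).squeeze(-1)
```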
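The Experiment Setup row fixes several concrete hyperparameters: token embedding size 32, LSTM hidden size 96, Adam at learning rate 0.001 for 1000 epochs, λ = 0.9, and step size η = 10. The sketch below shows one way those numbers could slot into the encoder, a joint training loss, and the gradient-ascent step in the continuous space. The weighted-sum form assumed for Eqn. (1), λ·L_pred + (1 − λ)·L_rec, and all module and variable names are assumptions of this sketch, since the equation itself is not reproduced in the table.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the "Experiment Setup" row.
EMBED_SIZE, HIDDEN_SIZE = 32, 96
LR, EPOCHS = 0.001, 1000
LAMBDA, ETA = 0.9, 10.0

class ArchEncoder(nn.Module):
    """Hypothetical encoder: token embedding + LSTM over the architecture string."""

    def __init__(self, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, EMBED_SIZE)
        self.lstm = nn.LSTM(EMBED_SIZE, HIDDEN_SIZE, batch_first=True)

    def forward(self, tokens):                  # tokens: (batch, seq_len)
        out, _ = self.lstm(self.embed(tokens))  # (batch, seq_len, HIDDEN_SIZE)
        return out

def joint_loss(pred_acc, true_acc, recon_logits, tokens):
    # Assumed form of Eqn. (1): weighted sum of the accuracy-prediction loss
    # and the architecture-reconstruction loss, traded off by lambda = 0.9.
    l_pred = nn.functional.mse_loss(pred_acc, true_acc)
    l_rec = nn.functional.cross_entropy(
        recon_logits.reshape(-1, recon_logits.size(-1)), tokens.reshape(-1))
    return LAMBDA * l_pred + (1 - LAMBDA) * l_rec

def continuous_step(e_x, predictor):
    # One gradient-ascent step in the continuous space: move e_x toward
    # higher predicted accuracy with step size eta = 10.
    e_x = e_x.clone().detach().requires_grad_(True)
    predictor(e_x).sum().backward()
    return (e_x + ETA * e_x.grad).detach()

# All three modules (encoder, predictor, decoder) would be trained jointly on
# the loss above with torch.optim.Adam(..., lr=LR) for EPOCHS epochs.
```

In NAO, the optimized continuous representation is then passed to the decoder to recover a new discrete architecture; that decoding step is not shown in this sketch.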