M-NAS: Meta Neural Architecture Search

Authors: Jiaxing Wang, Jiaxiang Wu, Haoli Bai, Jian Cheng (pp. 6186-6193)

AAAI 2020

Reproducibility Variable — Result — LLM Response
Research Type: Experimental — "Experimental results demonstrate the superiority of M-NAS against a number of competitive baselines on both toy regression and few shot classification problems." "Extensive experiments are conducted on both toy examples and real-world datasets. The results show that our proposed M-NAS efficiently discover proper network architecture and good parameter solutions given a specific task, outperforming a set of competitive baselines."
Researcher Affiliation: Collaboration — Jiaxing Wang (1,2), Jiaxiang Wu (3), Haoli Bai (4), Jian Cheng (1,2,5). 1: NLPR, Institute of Automation, Chinese Academy of Sciences; 2: University of Chinese Academy of Sciences; 3: Tencent AI Lab; 4: The Chinese University of Hong Kong; 5: Center for Excellence in Brain Science and Intelligence Technology, CAS. Emails: {jiaxing.wang, jcheng}@nlpr.ia.ac.cn, jonathanwu@tencent.com, hlbai@cse.cuhk.edu.hk
Pseudocode: Yes — "A complete algorithm of M-NAS is as shown in Algorithm 1." (Algorithm 1: M-NAS: Meta Neural Architecture Search)
Open Source Code: No — The paper does not include an unambiguous statement of, or a direct link to, source code for the M-NAS method. The only link provided (https://github.com/huaxiuyao/HSML) is for a baseline method, HSML.
Open Datasets: Yes — "We applied our method to MiniImagenet (Ravi and Larochelle 2017) datasets and a recently constructed benchmark Multi-datasets (Yao et al. 2019)."
Dataset Splits: No — The paper describes how tasks are constructed with per-task training and validation splits (e.g., D^tr_Ti for training and D^val_Ti for validation within each task) and mentions N-way K-shot problems. However, it does not report overall training/validation/test splits (as percentages or total counts) for the primary datasets (MiniImagenet, Multi-datasets) used in the meta-learning process.
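For readers unfamiliar with per-task splits, the N-way K-shot construction referenced above can be sketched as follows. This is a generic illustration with hypothetical names (`sample_task`, `toy`), not code from the paper; the comments map the two lists onto the paper's D^tr_Ti and D^val_Ti notation.

```python
import random

def sample_task(dataset, n_way=5, k_shot=1, k_query=15):
    """Build one N-way K-shot task with a per-task train/validation split.

    `dataset` maps class label -> list of examples. Class labels are
    remapped to 0..n_way-1 within the task, as is standard in few-shot
    classification.
    """
    classes = random.sample(sorted(dataset), n_way)
    d_train, d_val = [], []
    for new_label, cls in enumerate(classes):
        examples = random.sample(dataset[cls], k_shot + k_query)
        d_train += [(x, new_label) for x in examples[:k_shot]]   # D^tr_Ti
        d_val += [(x, new_label) for x in examples[k_shot:]]     # D^val_Ti
    return d_train, d_val

# usage: a 5-way 1-shot task drawn from a toy dataset of 20 classes
toy = {c: list(range(c * 100, c * 100 + 20)) for c in range(20)}
tr, va = sample_task(toy, n_way=5, k_shot=1, k_query=15)
```

Because every task carries its own train/validation pair, no single global split of the dataset is implied, which is why the "Dataset Splits" variable above is marked No.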
Hardware Specification: No — The paper does not provide hardware details such as CPU/GPU models, memory, or cloud instance types used for the experiments.
Software Dependencies: No — The paper states "We implement M-NAS in PyTorch (Paszke et al. 2017)" but does not specify a version for PyTorch or any other software dependency.
Experiment Setup: Yes — For toy regression: "The learning rates are ρ_inner = 0.001 and ρ_outer = 0.001 respectively. Specifically, we use smaller learning rate ρ_A = 1e-4 for architecture parameters W_A. The balancing weight is set as β = 0.01 and fast adaptation for optimal weights estimation is carried out for 5 steps." For few-shot classification: "Fast weights estimation in Equation (8) is carried out for 5 steps with vanilla SGD while meta-updates and searching Equation (9) are performed with Adam. The learning rates are ρ_inner = 0.01 and ρ_outer = 0.001 respectively. Specifically, we use smaller learning rate ρ_A = 1e-4 for architecture parameters W_A. The balancing weight is set as β = 0.01."
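The two-loop schedule quoted above (5-step fast adaptation at ρ_inner, then a meta-update at ρ_outer) can be sketched on a 1-D toy regression. This is a heavily simplified, first-order illustration under stated assumptions: a linear model y = w * x stands in for the paper's network, plain gradient descent replaces Adam in the outer loop, and the architecture parameters W_A and the β-weighted term are omitted; the toy learning rates match the reported values.

```python
rho_inner, rho_outer = 0.001, 0.001  # learning rates reported for toy regression

def mse_grad(w, xs, ys):
    # gradient of mean squared error for the linear model y_hat = w * x
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

# two illustrative linear tasks (true slopes 2 and 3)
tasks = [([1.0, 2.0], [2.0, 4.0]), ([1.0, 3.0], [3.0, 9.0])]

w_meta = 0.0
for _ in range(200):                        # meta-iterations (outer loop)
    meta_grad = 0.0
    for xs, ys in tasks:
        w_fast = w_meta
        for _ in range(5):                  # 5-step fast adaptation (inner loop)
            w_fast -= rho_inner * mse_grad(w_fast, xs, ys)
        meta_grad += mse_grad(w_fast, xs, ys)   # first-order meta-gradient
    w_meta -= rho_outer * meta_grad / len(tasks)
```

The first-order shortcut (evaluating the meta-gradient at the adapted weights rather than differentiating through the 5 inner steps) is a common approximation; the paper's actual update in Equations (8)-(9) also searches over architectures, which this sketch does not attempt.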