Dynamic Ensemble of Low-Fidelity Experts: Mitigating NAS “Cold-Start”

Authors: Junbo Zhao, Xuefei Ning, Enshu Liu, Binxin Ru, Zixuan Zhou, Tianchen Zhao, Chen Chen, Jiajin Zhang, Qingmin Liao, Yu Wang

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments across five search spaces with different architecture encoders under various experimental settings. For example, our methods can improve the Kendall's Tau correlation coefficient between actual performance and predicted scores from 0.2549 to 0.7064 with only 25 actual architecture-performance data on NDS-ResNet.
Researcher Affiliation | Collaboration | 1 Department of Electronic Engineering, Tsinghua University; 2 Huawei Technologies Co., Ltd; 3 Tsinghua Shenzhen International Graduate School; 4 SailYond Technology & Research Institute of Tsinghua University in Shenzhen
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks with explicit labels like 'Pseudocode' or 'Algorithm'.
Open Source Code | Yes | Our method will be implemented in Mindspore (Huawei 2020), and the example code is published at https://github.com/A-LinCui/DELE.
Open Datasets | Yes | We conduct experiments on five search spaces for a thorough evaluation: NAS-Bench-201, NAS-Bench-301, NDS-ResNet / ResNeXt-A, and MobileNet-V3. We divide architectures into a training and validation split for each space. All architectures in the training split with different types of low-fidelity information are used in the first training step. The detailed search space description, data split, and the types and acquisition of the utilized low-fidelity information are elaborated in the appendix.
Dataset Splits | Yes | We divide architectures into a training and validation split for each space. We train the predictors on the former and test their prediction ability on the latter. All architectures in the training split are used for pretraining, while the first 1% of architectures by index, with corresponding actual performance, are used for finetuning (Table 1 footnote). Proportions of training samples: 1%, 5%, 10%, 50%, 100% (Table 2 header).
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, memory). It mentions 'GPU hours' in reference to another method, but not for its own experimental setup.
Software Dependencies | No | The paper states: 'Our method will be implemented in Mindspore (Huawei 2020)'. While Mindspore is a software framework, '(Huawei 2020)' is a citation year rather than a version number, and no other software dependencies with version numbers are mentioned.
Experiment Setup | Yes | Following the previous studies (Ning et al. 2020; Xu et al. 2021), we train predictors with the hinge pair-wise ranking loss with margin m = 0.1. We first train different low-fidelity experts for 200 epochs and then finetune the dynamic ensemble performance predictor on the actual performance data for 200 epochs. An Adam optimizer with learning rate 1e-3 is applied for optimization. The batch sizes used for the NAS-Bench-201, NAS-Bench-301, NDS, and MobileNet-V3 search spaces are 512, 128, 128, and 512, respectively. We set N0 = 20, M = 7813, Tp = 5, Tpe = 50, Np = 5, π = 20, µ = 5 and K = 100 for our method (in the NAS-Bench-201 section).
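The Kendall's Tau figure quoted above (0.2549 → 0.7064) measures how well the predictor's score ordering agrees with the actual performance ordering. A minimal pure-Python sketch of the metric, with illustrative values that are not taken from the paper:

```python
def kendall_tau(a, b):
    """Kendall's Tau rank correlation: (concordant - discordant) pair
    count, normalized by the total number of pairs n*(n-1)/2."""
    n = len(a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (a[i] - a[j]) * (b[i] - b[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Illustrative: a predictor whose scores rank architectures exactly
# like their actual accuracies achieves tau = 1.0.
actual = [0.71, 0.68, 0.74, 0.69, 0.72]
predicted = [0.60, 0.55, 0.66, 0.58, 0.61]
print(kendall_tau(actual, predicted))  # -> 1.0
```

In practice one would use `scipy.stats.kendalltau`, which also handles ties; the loop above is only meant to make the pairwise definition concrete.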
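The hinge pair-wise ranking loss with margin m = 0.1 cited in the setup penalizes any pair where the predictor does not score the truly better architecture at least m higher than the worse one. A hedged sketch of that loss; the function name and data are illustrative assumptions, not the paper's code:

```python
def hinge_pairwise_ranking_loss(scores, perfs, m=0.1):
    """For every ordered pair where perfs[i] > perfs[j], accumulate
    max(0, m - (scores[i] - scores[j])): the term is zero once the
    predicted score gap for the pair exceeds the margin m."""
    total, pairs = 0.0, 0
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if perfs[i] > perfs[j]:
                total += max(0.0, m - (scores[i] - scores[j]))
                pairs += 1
    return total / pairs

# Correctly ordered predictions with a gap wider than m incur zero
# loss; inverted predictions are penalized.
perfs = [0.9, 0.5]
print(hinge_pairwise_ranking_loss([1.0, 0.0], perfs))  # -> 0.0
print(hinge_pairwise_ranking_loss([0.0, 1.0], perfs))  # -> 1.1
```

Because the loss depends only on score differences, it trains the predictor to rank architectures rather than to regress their exact accuracies, which matches the Kendall's Tau evaluation.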