AdaXpert: Adapting Neural Architecture for Growing Data

Authors: Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo, Peilin Zhao, Peng Wang, Mingkui Tan

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on two growth scenarios (increasing data volume and number of classes) demonstrate the effectiveness of the proposed method.
Researcher Affiliation | Collaboration | (1) School of Software Engineering, South China University of Technology, China; (2) Key Laboratory of Big Data and Intelligent Robot, Ministry of Education, China; (3) Tencent AI Lab, China; (4) National University of Singapore, Singapore; (5) Northwestern Polytechnical University, China; (6) Pazhou Laboratory, China.
Pseudocode | Yes | Algorithm 1: the overall algorithm of AdaXpert. Input: incoming datasets $\{D^{\text{new}}_t\}_{t=1}^{T}$; a well-trained model $\alpha_1$ for $D^{\text{new}}_1$; supernet $\mathcal{N}_1$ and controller $\pi(\cdot\,;\theta_1)$; threshold $\epsilon$. (A minimal sketch of this loop appears after the table.)
Open Source Code | Yes | Code is available at https://github.com/mr-eggplant/AdaXpert.
Open Datasets | Yes | We conduct our experiments on ImageNet, a large-scale image classification dataset (Deng et al., 2009).
Dataset Splits | Yes | Split $D_t$ into training and validation sets $\{D^{\text{train}}, D^{\text{val}}\}$.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper does not provide specific version numbers for ancillary software dependencies (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | Search space for architecture adaptation: the architecture space is based on the inverted mobile block (Howard et al., 2019). Specifically, the model is divided into 5 units with gradually reduced feature-map spatial size and increased number of channels. Each unit consists of at most 4 layers; only the first layer of a unit uses stride 2 when the feature-map size decreases, and all other layers use stride 1. In our experiments, we search for the number of layers in each unit (chosen from {2, 3, 4}), the kernel size in each layer (chosen from {3, 5, 7}), and the width expansion ratio in each layer (chosen from {3, 4, 6}). (See the sampling sketch after the table.)
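
The Algorithm 1 entry above describes a loop over incoming datasets that adapts the architecture only when the new data differ sufficiently from the old. Below is a minimal, non-authoritative sketch of that loop in Python. Every helper here (`split`, `data_distance`, the `propose` and `finetune` callables) is a hypothetical placeholder standing in for the paper's components, not the authors' actual API; the 80/20 split ratio and the toy distance are illustrative assumptions only.

```python
import random

def split(dataset):
    """Split D_t into training and validation sets (assumed 80/20)."""
    items = list(dataset)
    random.shuffle(items)
    cut = int(0.8 * len(items))
    return items[:cut], items[cut:]

def data_distance(new_data, old_data):
    """Toy placeholder for the distance between D_t and D_{t-1}."""
    return abs(len(new_data) - len(old_data)) / max(len(old_data), 1)

def adaxpert_loop(datasets, alpha_1, propose, finetune, eps):
    """Sketch of Algorithm 1: adapt the architecture as data grow.

    datasets -- incoming datasets D_1^new .. D_T^new
    alpha_1  -- model already well trained on D_1^new
    propose  -- controller pi(.; theta): (prev arch, data) -> new arch
    finetune -- trains the chosen architecture on the training split
    eps      -- threshold on the data distance
    """
    alpha_prev = alpha_1
    for t in range(1, len(datasets)):
        d_train, d_val = split(datasets[t])
        # Adapt only when the incoming data differ enough from the
        # previous round; otherwise reuse the previous architecture.
        if data_distance(datasets[t], datasets[t - 1]) > eps:
            alpha_t = propose(alpha_prev, d_train, d_val)
        else:
            alpha_t = alpha_prev
        finetune(alpha_t, d_train)
        alpha_prev = alpha_t
    return alpha_prev
```

The explicit `split` step mirrors the "Dataset Splits" row above: each incoming $D_t$ is divided into $\{D^{\text{train}}, D^{\text{val}}\}$ before the controller is consulted.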
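
To make the "Experiment Setup" row concrete, the following sketch encodes the stated search space and samples one architecture from it. The dictionary encoding, the `sample_architecture` name, and the assumption that every unit's first layer downsamples are illustrative choices, not the authors' implementation.

```python
import random

DEPTHS = [2, 3, 4]       # layers per unit
KERNELS = [3, 5, 7]      # per-layer kernel sizes
EXPANSIONS = [3, 4, 6]   # per-layer width expansion ratios
NUM_UNITS = 5

def sample_architecture(rng=random):
    """Draw one architecture uniformly at random from the space."""
    arch = []
    for _ in range(NUM_UNITS):
        depth = rng.choice(DEPTHS)
        layers = [
            # Per the description above, only a unit's first layer may
            # use stride 2 (when the feature map is downsampled); all
            # other layers use stride 1. We assume downsampling here.
            {"kernel": rng.choice(KERNELS),
             "expansion": rng.choice(EXPANSIONS),
             "stride": 2 if i == 0 else 1}
            for i in range(depth)
        ]
        arch.append(layers)
    return arch

if __name__ == "__main__":
    print(sample_architecture())
```

A back-of-envelope count, assuming all choices are independent: each layer has 3 × 3 = 9 (kernel, expansion) combinations, so a unit of depth d contributes 9^d options (81 + 729 + 6561 = 7371 across d ∈ {2, 3, 4}), giving roughly 7371^5 ≈ 2 × 10^19 candidate architectures over the 5 units.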