BNS: Building Network Structures Dynamically for Continual Learning
Authors: Qi Qin, Wenpeng Hu, Han Peng, Dongyan Zhao, Bing Liu
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on five datasets (MNIST, CIFAR-10, CIFAR-100, F-EMNIST and F-Celeba) show that BNS markedly outperforms the existing state-of-the-art Task-CL baselines. |
| Researcher Affiliation | Academia | 1 Center for Data Science, AAIS, Peking University; 2 Department of Information Science, School of Mathematical Sciences, Peking University; 3 Wangxuan Institute of Computer Technology, Peking University; 4 Department of Computer Science, University of Illinois at Chicago. {qinqi, phan, wenpeng.hu, zhaody}@pku.edu.cn, liub@uic.edu |
| Pseudocode | No | The paper describes the BNS algorithm using text and figures, but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code of BNS can be found at: https://github.com/lalalaup6/BNS |
| Open Datasets | Yes | We evaluate the proposed BNS on five image classification datasets: three standard benchmarks, MNIST [28], CIFAR-10 and CIFAR-100 [26], and two additional datasets, F-EMNIST and F-Celeba [21]. |
| Dataset Splits | Yes | We use ten percent of the training data of each task as the validation set for reward computation. |
| Hardware Specification | Yes | For all experiments, we use a single Nvidia RTX 2080Ti GPU. |
| Software Dependencies | No | In this paper, we use ResNet18 pre-trained on ImageNet in PyTorch to calculate the task similarity. |
| Experiment Setup | Yes | For BNS, we use SGD as the optimizer with learning rate 0.1 to train the continual learner f(·, θt), except for F-EMNIST and F-Celeba (learning rate 0.01). The parameters of the agent (LSTM) are updated by the Adam optimizer using a learning rate of 0.0001. The hyperparameters η and β are set to 0.001 and 0.003, respectively. To be consistent with the baseline settings, our continual learner trains for 100 epochs for all datasets except MNIST (10 epochs). |
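
The experiment-setup row above amounts to a small optimizer configuration. Below is a minimal PyTorch sketch of those settings; the `learner` and `agent` modules are hypothetical placeholders (the actual BNS architectures are in the linked repository), and the dataset-dependent learning rates and epoch counts simply restate the values quoted above.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the continual learner f(·, θt) and the LSTM agent;
# the real BNS networks are defined in the authors' repository.
learner = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
agent = nn.LSTM(input_size=64, hidden_size=64, batch_first=True)

dataset = "CIFAR-100"  # one of: MNIST, CIFAR-10, CIFAR-100, F-EMNIST, F-Celeba

# Learning rates and epoch counts as reported in the paper's setup.
learner_lr = 0.01 if dataset in ("F-EMNIST", "F-Celeba") else 0.1
epochs = 10 if dataset == "MNIST" else 100

# SGD for the continual learner, Adam for the agent (LSTM).
learner_opt = torch.optim.SGD(learner.parameters(), lr=learner_lr)
agent_opt = torch.optim.Adam(agent.parameters(), lr=1e-4)

# Hyperparameters η and β from the paper; how they enter the reward/loss is not shown here.
eta, beta = 0.001, 0.003
```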