Mitigating Forgetting in Online Continual Learning with Neuron Calibration
Authors: Haiyan Yin, Peng Yang, Ping Li
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments to evaluate our method on four benchmark continual learning datasets. The results show that neuron calibration plays a vital role in improving online continual learning performance, and our method could substantially improve the state-of-the-art performance on all the evaluated datasets. In this section, we demonstrate the empirical evaluation results of comparing our method with a number of closely related baselines under various experimental settings. |
| Researcher Affiliation | Industry | Haiyan Yin, Peng Yang, Ping Li; Cognitive Computing Lab, Baidu Research; 10900 NE 8th St., Bellevue, WA 98004, USA; {haiyanyin, pengyang01, liping11}@baidu.com |
| Pseudocode | Yes | Algorithm 1: Neuron Calibration for Online Continual Learning (NCCL). A hedged sketch of the calibration idea appears after the table. |
| Open Source Code | No | The paper does not contain an explicit statement about open-sourcing the code or a link to a code repository. |
| Open Datasets | Yes | We consider the following four continual learning datasets as our evaluation testbeds: Permuted MNIST (denoted as pMNIST) [19], Split CIFAR [19], Split miniImageNet [6], and a continual learning benchmark created from the real-world dataset CORe50, Split CORe50 [18]. A hedged sketch of the standard pMNIST task construction appears after the table. |
| Dataset Splits | No | The paper mentions training and testing datasets, but it does not provide specific details on a separate validation split (percentages, sample counts) used for hyperparameter tuning or model selection in its own experimental setup. The term "validation" is used when discussing other methods. |
| Hardware Specification | No | The paper mentions its implementation framework (PaddlePaddle) and model architectures (reduced ResNet18, MLP), but it does not specify the GPU or CPU models, or any other hardware, used to run the experiments. |
| Software Dependencies | No | Our method is implemented using the PaddlePaddle (PArallel Distributed Deep LEarning) framework. The paper names PaddlePaddle but does not provide a specific version number or list other software dependencies with version numbers. |
| Experiment Setup | No | We present the detailed hyperparameter settings for all the baseline methods as well as NCCL on all the datasets in the appendix. This indicates that the detailed settings appear only in the appendix, not in the main text. |
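
The Pseudocode row cites Algorithm 1 (NCCL) but does not reproduce it. The sketch below illustrates the general neuron-calibration idea only, not the paper's algorithm: it assumes calibration takes the form of a learnable per-neuron sigmoid gate on each layer's output, with the class name `CalibratedLinear` and the initialization being illustrative choices. It is written in PyTorch for brevity, even though the paper's implementation uses PaddlePaddle.

```python
# Hypothetical sketch of per-layer neuron calibration. NOT the paper's exact
# formulation: we assume calibration parameters act as a learnable gate that
# rescales each output neuron of a base layer.
import torch
import torch.nn as nn

class CalibratedLinear(nn.Module):
    """Linear layer whose output neurons are rescaled by learnable
    calibration parameters (assumed sigmoid-gated, one per neuron)."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)  # shared base weights
        # Calibration parameters, initialized so sigmoid(c) is close to 1,
        # i.e. the layer starts out as an (almost) uncalibrated pass-through.
        self.calib = nn.Parameter(torch.full((out_features,), 4.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.base(x)
        return torch.sigmoid(self.calib) * h  # per-neuron gating of activations

# Usage: an MLP for flattened pMNIST-style inputs, calibrating each hidden layer.
model = nn.Sequential(
    CalibratedLinear(784, 256), nn.ReLU(),
    CalibratedLinear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
logits = model(torch.randn(32, 784))
print(logits.shape)  # torch.Size([32, 10])
```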
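
The Open Datasets row lists pMNIST among the benchmarks. Below is a hedged sketch of the standard Permuted MNIST task construction, where each task applies one fixed random pixel permutation to every image; the function name, task count, and seed are illustrative assumptions, and the paper's exact protocol may differ.

```python
# Hypothetical construction of Permuted MNIST (pMNIST) tasks following the
# common benchmark protocol: one fixed random pixel permutation per task.
import numpy as np

def make_pmnist_tasks(images: np.ndarray, n_tasks: int, seed: int = 0):
    """images: (N, 784) flattened MNIST; returns one permuted copy per task."""
    rng = np.random.default_rng(seed)
    tasks = []
    for _ in range(n_tasks):
        perm = rng.permutation(images.shape[1])  # fixed permutation for this task
        tasks.append(images[:, perm])
    return tasks

# Usage with dummy data standing in for MNIST:
dummy = np.random.rand(100, 784).astype(np.float32)
tasks = make_pmnist_tasks(dummy, n_tasks=5)
print(len(tasks), tasks[0].shape)  # 5 (100, 784)
```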