Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning

Authors: Mohamed Elsayed, A. Rupam Mahmood

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that many existing methods suffer from at least one of the issues, predominantly manifested by their decreasing accuracy over tasks. On the other hand, UPGD continues to improve performance and surpasses or is competitive with all methods in all problems. Finally, in extended reinforcement learning experiments with PPO, we show that while Adam exhibits a performance drop after initial learning, UPGD avoids it by addressing both continual learning issues.
Researcher Affiliation | Academia | Mohamed Elsayed, Department of Computing Science, University of Alberta; Alberta Machine Intelligence Institute (Amii); mohamedelsayed@ualberta.ca. A. Rupam Mahmood, Department of Computing Science, University of Alberta; Canada CIFAR AI Chair, Amii; armahmood@ualberta.ca
Pseudocode | Yes | Algorithm 1: UPGD
Open Source Code | Yes | Code is available at https://github.com/mohmdelsayed/upgd
Open Datasets | Yes | For the latter, we use non-stationary streaming problems based on MNIST (LeCun et al., 1998), EMNIST (Cohen et al., 2017), CIFAR-10 (Krizhevsky, 2009), and ImageNet (Deng et al., 2009) datasets
Dataset Splits | No | The paper primarily describes an online streaming learning setup and mentions a "held-out set" for an offline variation, but it does not provide explicit, reproducible train/validation/test split percentages or sample counts for the main experiments. Although it states that the hyperparameter search was conducted to "maximize the area under the online accuracy curve," the specific validation split used for this search is not detailed.
Hardware Specification | No | The paper acknowledges computational resources from the Digital Research Alliance of Canada, but it does not specify the hardware used to run the experiments, such as GPU or CPU models, memory, or server configurations.
Software Dependencies | No | The paper mentions using the PPO algorithm, specifically the CleanRL implementation, and reports Adam hyperparameters such as β1, β2, and ϵ, but it does not list software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | In each of the following experiments, a thorough hyperparameter search is conducted (see Appendix I). Our criterion was to find the best set of hyperparameters for each method that maximizes the area under the online accuracy curve. Unless stated otherwise, we averaged the performance of each method over 20 independent runs. We focus on the key results here and give the full experimental details in Appendix I.
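
The Pseudocode and Experiment Setup rows above refer to Algorithm 1 (UPGD) and to a hyperparameter-selection criterion based on the area under the online accuracy curve. The sketch below illustrates one way a UPGD-style update and that criterion could look in PyTorch; it is a minimal sketch, assuming a first-order weight utility with an exponential trace and sigmoid scaling, and the names (`upgd_step`, `area_under_online_accuracy`, `beta_utility`, `sigma`) are illustrative rather than taken from the authors' released code.

```python
import torch


def upgd_step(params, lr=0.01, beta_utility=0.999, sigma=0.001):
    """Sketch of a utility-gated, perturbed gradient step (UPGD-style).

    Assumes p.grad is already populated for each parameter. A running
    utility trace is stored on each parameter; high-utility weights are
    protected (small updates), while low-utility weights receive the full
    gradient plus noise, which restores plasticity.
    """
    params = list(params)
    with torch.no_grad():
        for p in params:
            if not hasattr(p, "utility_trace"):
                p.utility_trace = torch.zeros_like(p)
        # Global scaling factor: largest utility magnitude across parameters.
        global_max = max(p.utility_trace.abs().max() for p in params)
        for p in params:
            if p.grad is None:
                continue
            # First-order weight utility: approximate increase in loss if the
            # weight were removed, estimated as -grad * weight.
            utility = -p.grad * p
            # Exponential moving trace of the utility signal.
            p.utility_trace.mul_(beta_utility).add_((1 - beta_utility) * utility)
            # Squash the scaled utility into (0, 1); the gate shrinks updates
            # to useful weights and leaves less useful weights fully plastic.
            gate = 1 - torch.sigmoid(p.utility_trace / (global_max + 1e-8))
            # Perturbed, utility-gated update: gradient and noise mostly
            # reach low-utility weights.
            noise = sigma * torch.randn_like(p)
            p.add_(-lr * (p.grad + noise) * gate)


def area_under_online_accuracy(correct_flags):
    """Selection criterion quoted in the Experiment Setup row: the area under
    the online accuracy curve, i.e., the average per-sample online accuracy
    over the whole stream (correct_flags holds 0/1 per streamed example)."""
    return sum(correct_flags) / len(correct_flags)
```

In a training loop, `upgd_step(model.parameters())` would take the place of `optimizer.step()` after `loss.backward()`; the released implementation at https://github.com/mohmdelsayed/upgd should be treated as authoritative for the exact utility definition and scaling used in the paper.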