Natural continual learning: success is a journey, not (just) a destination

Authors: Ta-Chu Kao, Kristopher Jensen, Gido van de Ven, Alberto Bernacchia, Guillaume Hennequin

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Our method outperforms both standard weight regularization techniques and projection-based approaches when applied to continual learning problems in feedforward and recurrent networks. We show that NCL outperforms previous continual learning algorithms in both feedforward and recurrent networks." |
| Researcher Affiliation | Collaboration | Ta-Chu Kao (1*), Kristopher T. Jensen (1*), Gido M. van de Ven (1, 2), Alberto Bernacchia (3), Guillaume Hennequin (1). 1: Department of Engineering, University of Cambridge; 2: Center for Neuroscience and Artificial Intelligence, Baylor College of Medicine; 3: MediaTek Research, Cambridge |
| Pseudocode | Yes | "The NCL algorithm is described in pseudocode in Appendix E together with additional implementation and computational details." |
| Open Source Code | Yes | "Our code is available online" (https://github.com/tachukao/ncl) |
| Open Datasets | Yes | "To verify the utility of NCL for continual learning, we first compared our algorithm to standard methods in feedforward networks across two continual learning benchmarks: split MNIST and split CIFAR-100 (see Appendix B for task details)." "We thus considered an augmented version of the stroke MNIST dataset [SMNIST; 9]." (See the split-task sketch below the table.) |
| Dataset Splits | No | "For the split MNIST and split CIFAR-100 experiments, each baseline method had a single hyperparameter (c for SI, λ for EWC and KFAC, α for OWM, and pw for NCL; Appendix E) that was optimized on a held-out seed (see Appendix I.2)." The paper's mention of "a held-out seed" for hyperparameter optimization implies a validation set, but no specific split percentages or counts are given. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions using Adam for optimization but does not give version numbers for any software dependencies such as programming languages or libraries. |
| Experiment Setup | Yes | "For the split MNIST and split CIFAR-100 experiments, each baseline method had a single hyperparameter (c for SI, λ for EWC and KFAC, α for OWM, and pw for NCL; Appendix E) that was optimized on a held-out seed (see Appendix I.2). However, for our experiments in RNNs, we instead fix pw = 1 and perform a hyperparameter optimization over α for a more direct comparison with OWM and DOWM." (See the hyperparameter-selection sketch below the table.) |
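The split MNIST benchmark referenced in the Open Datasets row follows the standard protocol of dividing the 10 digit classes into 5 two-way classification tasks. The sketch below illustrates that construction only; it is not the authors' code, and the function name `make_split_tasks` and the dummy labels are illustrative assumptions.

```python
# Minimal sketch (assumption, not the paper's implementation): build a
# "split" task sequence by grouping consecutive classes into tasks.
import numpy as np

def make_split_tasks(labels, classes_per_task=2):
    """Return, for each task, its class labels and the example indices
    belonging to those classes."""
    classes = np.unique(labels)
    n_tasks = len(classes) // classes_per_task
    tasks = []
    for t in range(n_tasks):
        task_classes = classes[t * classes_per_task:(t + 1) * classes_per_task]
        idx = np.where(np.isin(labels, task_classes))[0]
        tasks.append({"classes": task_classes, "indices": idx})
    return tasks

# Dummy labels standing in for MNIST targets (digits 0-9):
labels = np.random.randint(0, 10, size=1000)
for t, task in enumerate(make_split_tasks(labels)):
    print(f"task {t}: classes {task['classes']}, {len(task['indices'])} examples")
```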
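The Dataset Splits and Experiment Setup rows describe selecting each method's single hyperparameter on a held-out random seed and reporting results on other seeds. The sketch below shows that selection loop under stated assumptions: the candidate grid, seed choices, and the stub `run_continual_learning` are placeholders, not values or code from the paper.

```python
# Hedged sketch of single-hyperparameter selection on a held-out seed.
import numpy as np

def run_continual_learning(hyperparam, seed):
    """Stub for training on the full task sequence with a given seed and
    returning average accuracy over tasks; replace with a real training run
    (the hyperparameter is unused in this stub)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(0.5, 1.0)

grid = [1e-2, 1e-1, 1.0, 1e1, 1e2]       # illustrative candidate values
heldout_seed, eval_seeds = 0, [1, 2, 3]  # illustrative seed assignment

# Pick the value that scores best on the held-out seed ...
best = max(grid, key=lambda h: run_continual_learning(h, heldout_seed))
# ... then evaluate that single value on fresh seeds.
scores = [run_continual_learning(best, s) for s in eval_seeds]
print(f"selected {best}; mean accuracy over evaluation seeds: {np.mean(scores):.3f}")
```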