Scalable Recollections for Continual Lifelong Learning

Authors: Matthew Riemer, Tim Klinger, Djallel Bouneffouf, Michele Franceschini

AAAI 2019, pp. 1352-1359 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present a novel scalable architecture and training algorithm in this challenging domain and provide an extensive evaluation of its performance. Our results show that we can achieve considerable gains on top of state-of-the-art methods such as GEM.
Researcher Affiliation | Industry | Matthew Riemer, Tim Klinger, Djallel Bouneffouf, Michele Franceschini, IBM Research, T.J. Watson Research Center, Yorktown Heights, NY. {mdriemer, tklinger, djallel.bouneffouf, franceschini}@us.ibm.com
Pseudocode | Yes | Algorithm 1: Experience Replay Training for Continual Learning with a Scalable Recollection Module (see the training-loop sketch after this table).
Open Source Code | No | Footnote 1: "See an extended version of this paper including the appendix at https://arxiv.org/pdf/1711.06761.pdf." This link points to the paper itself on arXiv, not to source code.
Open Datasets | Yes | MNIST-Rotations (Lopez-Paz and Ranzato 2017): a dataset with 20 tasks including 1,000 training examples for each task. Incremental CIFAR-100 (Lopez-Paz and Ranzato 2017): a continual learning split of the CIFAR-100 image classification dataset considering each of the 20 coarse-grained labels to be a task with 2,500 examples each. Omniglot (Lake et al. 2011): a character recognition dataset in which each of the 50 alphabets is considered a task. (See the task-split sketch after this table.)
Dataset Splits | No | The paper does not explicitly provide specific train/validation/test dataset splits (e.g., percentages, exact counts) or mention a validation set.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | Architecture: We model our experiments after (Lopez-Paz and Ranzato 2017) and use a ResNet-18 model as Fθ for CIFAR-100 and Omniglot, as well as a two-layer MLP with 200 hidden units for MNIST-Rotations. Across all of our experiments, our autoencoder models include three convolutional layers in the encoder and three deconvolutional layers in the decoder. Each convolutional layer has a kernel size of 5. As we vary the size of our categorical latent variable across experiments, we in turn vary the number of filters in each convolutional layer to keep the number of hidden variables consistent at all intermediate layers of the network. Module hyperparameters: In our experiments we used a binary cross-entropy loss for both ℓ and ℓ_REC. (See the architecture sketch after this table.)
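
The Pseudocode row points to Algorithm 1 (Experience Replay Training for Continual Learning with a Scalable Recollection Module), which is only named, not reproduced, in this summary. The snippet below is a minimal PyTorch-style sketch of the general idea that title describes: keep compressed latent codes of past examples and interleave decoded "recollections" with each new-task batch. Every name here (`task_model`, `autoencoder`, `code_buffer`, `replay_batch_size`) is a hypothetical placeholder, and details such as buffer size management and training of the recollection module are omitted; consult the paper for the actual algorithm.

```python
import random
import torch
import torch.nn.functional as F

def replay_training_step(task_model, autoencoder, code_buffer,
                         x_new, y_new, optimizer, replay_batch_size=32):
    """One hedged training step: new-task data plus decoded recollections.

    code_buffer is a list of (latent_code, label) pairs from earlier tasks;
    autoencoder.encode / autoencoder.decode are assumed methods of some
    recollection autoencoder (see the architecture sketch further below).
    """
    optimizer.zero_grad()

    # Loss on the incoming batch from the current task.
    loss = F.cross_entropy(task_model(x_new), y_new)

    # Interleave replayed recollections reconstructed from stored codes.
    if len(code_buffer) >= replay_batch_size:
        codes, labels = zip(*random.sample(code_buffer, replay_batch_size))
        with torch.no_grad():
            x_replay = autoencoder.decode(torch.stack(codes))
        loss = loss + F.cross_entropy(task_model(x_replay), torch.stack(labels))

    loss.backward()
    optimizer.step()

    # Store compressed codes (not raw inputs) for future replay; any cap on
    # buffer size or selection strategy is left out of this sketch.
    with torch.no_grad():
        new_codes = autoencoder.encode(x_new)
    code_buffer.extend(zip(new_codes, y_new))
```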
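
The Open Datasets row describes Incremental CIFAR-100 as one task per coarse-grained label with 2,500 examples each. The sketch below is one illustrative way to build that split directly from the original CIFAR-100 python pickle, which stores both fine and coarse labels; it follows the quoted description rather than any split script released with (Lopez-Paz and Ranzato 2017), so details such as per-task label remapping may differ.

```python
import pickle
import numpy as np

def load_cifar100_tasks(train_pickle_path):
    """Group CIFAR-100 training images into 20 tasks, one per coarse label.

    Assumes the original python-version pickle, whose dict contains
    b'data', b'fine_labels', and b'coarse_labels' fields.
    """
    with open(train_pickle_path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")

    images = batch[b"data"].reshape(-1, 3, 32, 32)   # 50,000 training images
    fine = np.array(batch[b"fine_labels"])           # 100 fine classes
    coarse = np.array(batch[b"coarse_labels"])       # 20 superclasses

    tasks = {}
    for task_id in range(20):
        idx = np.where(coarse == task_id)[0]         # 2,500 examples per task
        tasks[task_id] = (images[idx], fine[idx])
    return tasks
```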
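
The Experiment Setup row fixes only part of each architecture. The sketch below is one possible PyTorch reading of it, not the authors' code: the MNIST-Rotations MLP is interpreted as two hidden layers of 200 units, and the recollection autoencoder uses three 5x5 convolutions and three transposed convolutions sized for 32x32 inputs; filter counts, strides, and the categorical latent layer are placeholders the quote does not specify.

```python
import torch.nn as nn

# MLP for MNIST-Rotations. "Two layer MLP with 200 hidden units" is read here
# as two hidden layers of 200 units; the 784-dim input and 10-way output are
# standard MNIST assumptions, not stated in the quote.
mlp = nn.Sequential(
    nn.Linear(784, 200), nn.ReLU(),
    nn.Linear(200, 200), nn.ReLU(),
    nn.Linear(200, 10),
)


class RecollectionAutoencoder(nn.Module):
    """Skeleton matching the stated shape: three 5x5 convolutional layers in
    the encoder and three deconvolutional layers in the decoder, sized for
    32x32 inputs. Filter counts, strides, and the categorical latent layer are
    placeholders; the paper varies filters with the latent size.
    """

    def __init__(self, channels=3, filters=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, filters, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(filters, filters, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(filters, filters, kernel_size=5, stride=2, padding=2), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(filters, filters, kernel_size=5, stride=2,
                               padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(filters, filters, kernel_size=5, stride=2,
                               padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(filters, channels, kernel_size=5, stride=2,
                               padding=2, output_padding=1),
            # Sigmoid output pairs with a binary cross-entropy reconstruction loss.
            nn.Sigmoid(),
        )

    def encode(self, x):
        return self.encoder(x)

    def decode(self, z):
        return self.decoder(z)

    def forward(self, x):
        return self.decode(self.encode(x))
```

Training the reconstruction with a binary cross-entropy loss (e.g., `nn.BCELoss()` on inputs scaled to [0, 1]) is consistent with the stated choice of binary cross entropy for both ℓ and ℓ_REC.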