Gradient Episodic Memory for Continual Learning

Authors: David Lopez-Paz, Marc'Aurelio Ranzato

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on variants of the MNIST and CIFAR-100 datasets demonstrate the strong performance of GEM when compared to the state-of-the-art.
Researcher Affiliation | Industry | David Lopez-Paz and Marc'Aurelio Ranzato, Facebook Artificial Intelligence Research, {dlp,ranzato}@fb.com
Pseudocode | Yes | Algorithm 1 summarizes the training and evaluation protocol of GEM over a continuum of data. (A sketch of this step follows the table.)
Open Source Code | Yes | Our source code is available at https://github.com/facebookresearch/GradientEpisodicMemory.
Open Datasets | Yes | MNIST Permutations [Kirkpatrick et al., 2017], a variant of the MNIST dataset of handwritten digits [LeCun et al., 1998], where each task is transformed by a fixed permutation of pixels.
Dataset Splits | No | The paper does not provide specific details on training/validation/test dataset splits, such as percentages, sample counts, or an explicit splitting methodology; it only refers to a 'test set' for evaluation.
Hardware Specification | No | The paper does not provide specific hardware details, such as exact GPU/CPU models, processor types, or memory amounts, used for running its experiments.
Software Dependencies | No | The paper mentions using 'plain SGD' and 'ResNet18' but does not specify any software dependencies with version numbers (e.g., Python, TensorFlow, or PyTorch versions).
Experiment Setup | Yes | On the MNIST tasks, we use fully-connected neural networks with two hidden layers of 100 ReLU units. On the CIFAR100 tasks, we use a smaller version of ResNet18 [He et al., 2015], with three times less feature maps across all layers. Also on CIFAR100, the network has a final linear classifier per task. We train all the networks and baselines using plain SGD on mini-batches of 10 samples. (A sketch of the MNIST setup follows the table.)
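
As a point of reference for the "Pseudocode" row, below is a minimal PyTorch sketch of the gradient-projection idea that the paper's Algorithm 1 describes. It shows only the closed-form, single-constraint case (one reference gradient computed on the episodic memory); the paper's Algorithm 1 instead solves a small quadratic program over the gradients of all previous tasks. The helper names (flat_grad, assign_grad, gem_step) are hypothetical and are not taken from the authors' released code.

```python
# Minimal sketch of GEM's gradient-projection step (hypothetical helper names).
# The paper solves a QP over the gradients of all previous tasks; this sketch
# shows the single-constraint case, which has a closed-form projection.
import torch

def flat_grad(model):
    """Concatenate all parameter gradients into one flat vector."""
    return torch.cat([p.grad.detach().reshape(-1) for p in model.parameters()])

def assign_grad(model, g):
    """Write a flat gradient vector back into the model's .grad fields."""
    offset = 0
    for p in model.parameters():
        n = p.numel()
        p.grad.copy_(g[offset:offset + n].view_as(p))
        offset += n

def gem_step(model, loss_fn, optimizer, x, y, memory):
    """One GEM update: keep the loss on the episodic memory of previous
    tasks from increasing (to first order) while learning the current task."""
    g_ref = None
    if memory is not None:
        mx, my = memory                      # episodic memory of past tasks
        optimizer.zero_grad()
        loss_fn(model(mx), my).backward()
        g_ref = flat_grad(model)             # reference gradient on memory

    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    g = flat_grad(model)                     # gradient on the current batch

    # A negative dot product means the proposed update would increase the
    # memory loss; project g onto the constraint boundary in that case.
    if g_ref is not None and torch.dot(g, g_ref) < 0:
        g = g - (torch.dot(g, g_ref) / torch.dot(g_ref, g_ref)) * g_ref
        assign_grad(model, g)

    optimizer.step()
```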
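
Similarly, for the "Experiment Setup" row, the following sketch reflects the MNIST configuration quoted above: a fully-connected network with two hidden layers of 100 ReLU units, plain SGD on mini-batches of 10 samples, and one fixed pixel permutation per task for MNIST Permutations. The learning rate and the helper names are assumptions for illustration, not values confirmed by this section; the CIFAR100 setup (a reduced ResNet18 with a per-task linear classifier) uses the same training loop and is omitted here.

```python
# Hypothetical sketch of the MNIST setup described above: an MLP with two
# hidden layers of 100 ReLU units, trained with plain SGD on mini-batches
# of 10 samples, over tasks built by fixed pixel permutations.
import torch
import torch.nn as nn

def make_mlp(n_inputs=784, n_outputs=10, n_hidden=100):
    return nn.Sequential(
        nn.Linear(n_inputs, n_hidden), nn.ReLU(),
        nn.Linear(n_hidden, n_hidden), nn.ReLU(),
        nn.Linear(n_hidden, n_outputs),
    )

def make_permuted_task(images, labels, seed):
    """Build one 'MNIST Permutations' task: images of shape (N, 784) have
    their pixels shuffled by a permutation that is fixed for the whole task."""
    gen = torch.Generator().manual_seed(seed)
    perm = torch.randperm(images.shape[1], generator=gen)
    return images[:, perm], labels

model = make_mlp()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # learning rate assumed
loss_fn = nn.CrossEntropyLoss()
batch_size = 10  # mini-batches of 10 samples, as stated in the paper
```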