New Insights on Reducing Abrupt Representation Change in Online Continual Learning

Authors: Lucas Caccia, Rahaf Aljundi, Nader Asadi, Tinne Tuytelaars, Joelle Pineau, Eugene Belilovsky

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results show significant gains over strong baselines on standard continual learning benchmarks. We evaluate on Split CIFAR-10, Split CIFAR-100 and Split Mini Imagenet using the protocol and constraints from Aljundi et al. (2019a); Ji et al. (2020); Shim et al. (2020).
Researcher Affiliation | Collaboration | Lucas Caccia (McGill University, Mila; Facebook AI Research); Rahaf Aljundi (Toyota Motor Europe); Nader Asadi (Concordia University, Mila); Tinne Tuytelaars (KU Leuven); Joelle Pineau (McGill University, Mila; Facebook AI Research); Eugene Belilovsky (Concordia University, Mila)
Pseudocode | Yes | Algorithm 1: ER-AML (a hedged sketch of this update appears after the table)
Open Source Code | Yes | Code to reproduce experiments is available at www.github.com/pclucas14/AML
Open Datasets | Yes | Split CIFAR-10 partitions the dataset into 5 disjoint tasks containing two classes each (as in Aljundi et al. (2019a); Shim et al. (2020)). Split CIFAR-100 comprises 20 tasks, each containing a disjoint set of 5 labels. We follow the split in Chaudhry et al. (2019). All CIFAR experiments process 32×32 images. Split Mini Imagenet splits the Mini Imagenet dataset into 20 disjoint tasks of 5 labels each. Images are 84×84. (A task-split sketch follows the table.)
Dataset Splits | Yes | For all datasets considered, we withhold 5% of the training data for validation.
Hardware Specification | No | We acknowledge resources provided by Compute Canada and Calcul Québec.
Software Dependencies | No | The paper states that a 'detailed codebase' is provided but does not list specific software dependencies with version numbers in the text.
Experiment Setup | Yes | We use a reduced ResNet-18 for our experiments, and leave the batch size and the rehearsal batch size fixed at 10. For all datasets considered, we withhold 5% of the training data for validation. For each method a grid search was run over the possible hyperparameters, which we detail below (a grid-search sketch follows the table). DER++ (Buzzega et al., 2020): LR: [0.1, 0.01, 0.001]; α: [0.25, 0.5, 0.75]; β: [0.5, 0.75, 1].
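To make the Pseudocode row concrete, below is a minimal PyTorch-style sketch in the spirit of Algorithm 1 (ER-AML): a SupCon-style metric loss on the incoming batch, with rehearsal samples included in the contrast set, plus plain cross-entropy on the rehearsal batch. The helper objects (model, classifier, buffer.sample, buffer.add) and details such as the temperature, the positive/negative selection, and the loss weighting are illustrative assumptions, not the authors' exact implementation; see the released code at www.github.com/pclucas14/AML for that.

import torch
import torch.nn.functional as F

def er_aml_step(model, classifier, buffer, x_in, y_in, opt, temperature=0.2):
    # One online update in the spirit of Algorithm 1 (ER-AML, sketch).
    # Incoming samples get a SupCon-style metric loss whose contrast set
    # also contains rehearsal samples; the rehearsal batch itself is
    # trained with plain cross-entropy.
    x_bf, y_bf = buffer.sample(batch_size=x_in.size(0))   # hypothetical replay-buffer API

    z_in = F.normalize(model(x_in), dim=1)                # normalized features, incoming
    z_bf = F.normalize(model(x_bf), dim=1)                # normalized features, rehearsal

    z_all = torch.cat([z_in, z_bf], dim=0)
    y_all = torch.cat([y_in, y_bf], dim=0)
    sim = z_in @ z_all.t() / temperature                  # (B, 2B) similarity matrix

    B = z_in.size(0)
    self_mask = torch.eye(B, 2 * B, dtype=torch.bool, device=sim.device)
    pos_mask = (y_in.unsqueeze(1) == y_all.unsqueeze(0)) & ~self_mask

    # Supervised-contrastive loss on incoming anchors (self-similarity excluded).
    log_prob = sim - torch.logsumexp(sim.masked_fill(self_mask, -1e9), dim=1, keepdim=True)
    loss_in = -(log_prob * pos_mask.float()).sum(1) / pos_mask.sum(1).clamp(min=1)
    loss_in = loss_in.mean()

    # Plain cross-entropy on the rehearsal batch only.
    loss_bf = F.cross_entropy(classifier(model(x_bf)), y_bf)

    opt.zero_grad()
    (loss_in + loss_bf).backward()
    opt.step()

    buffer.add(x_in, y_in)                                # reservoir-style buffer update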
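The Open Datasets and Dataset Splits rows can be reproduced with a short data-preparation routine. The sketch below builds Split CIFAR-10 as 5 disjoint two-class tasks and withholds 5% of each task's training data for validation; the class ordering, seed, and per-task holdout are assumptions, since the paper follows the splits of Aljundi et al. (2019a) and Chaudhry et al. (2019).

import numpy as np
from torchvision.datasets import CIFAR10

def make_split_cifar10(root="./data", n_tasks=5, val_fraction=0.05, seed=0):
    # Partition CIFAR-10 into disjoint 2-class tasks and hold out 5% of
    # each task's training data for validation (illustrative split only).
    train = CIFAR10(root, train=True, download=True)
    targets = np.array(train.targets)
    rng = np.random.RandomState(seed)

    classes_per_task = 10 // n_tasks                       # 2 classes per task
    tasks = []
    for t in range(n_tasks):
        task_classes = list(range(t * classes_per_task, (t + 1) * classes_per_task))
        idx = np.where(np.isin(targets, task_classes))[0]
        rng.shuffle(idx)
        n_val = int(len(idx) * val_fraction)               # 5% validation holdout
        tasks.append({"classes": task_classes,
                      "val_idx": idx[:n_val],
                      "train_idx": idx[n_val:]})
    return tasks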
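Finally, the hyperparameter grid quoted in the Experiment Setup row can be searched with a few lines. The sketch below assumes a hypothetical train_and_eval callable that trains DER++ with the given settings and reports accuracy on the 5% validation split; model selection by validation accuracy is an assumption, the paper only states that a grid search was run.

from itertools import product

def grid_search_derpp(train_and_eval):
    # Exhaustive search over the DER++ grid quoted above (Buzzega et al., 2020).
    # train_and_eval(lr, alpha, beta) is a hypothetical callable returning
    # validation accuracy for one hyperparameter setting.
    lrs    = [0.1, 0.01, 0.001]
    alphas = [0.25, 0.5, 0.75]
    betas  = [0.5, 0.75, 1.0]

    best_cfg, best_acc = None, float("-inf")
    for lr, alpha, beta in product(lrs, alphas, betas):
        acc = train_and_eval(lr, alpha, beta)
        if acc > best_acc:
            best_cfg, best_acc = (lr, alpha, beta), acc
    return best_cfg, best_acc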