Selfless Sequential Learning
Authors: Rahaf Aljundi, Marcus Rohrbach, Tinne Tuytelaars
ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper we look at a scenario with fixed model capacity, and postulate that the learning process should not be selfish, i.e. it should account for future tasks to be added and thus leave enough capacity for them. To achieve Selfless Sequential Learning we study different regularization strategies and activation functions. We find that imposing sparsity at the level of the representation (i.e. neuron activations) is more beneficial for sequential learning than encouraging parameter sparsity. In particular, we propose a novel regularizer that encourages representation sparsity by means of neural inhibition. It results in few active neurons, which in turn leaves more free neurons to be utilized by upcoming tasks. As neural inhibition over an entire layer can be too drastic, especially for complex tasks requiring strong representations, our regularizer only inhibits other neurons in a local neighbourhood, inspired by lateral inhibition processes in the brain. We combine our novel regularizer with state-of-the-art lifelong learning methods that penalize changes to important previously learned parts of the network. We show that our new regularizer leads to increased sparsity, which translates into consistent performance improvement on diverse datasets. (A hedged code sketch of such a local-inhibition penalty appears after the table.) |
| Researcher Affiliation | Collaboration | Rahaf Aljundi, KU Leuven ESAT-PSI, Belgium (rahaf.aljundi@gmail.com); Marcus Rohrbach, Facebook AI Research (mrf@fb.com); Tinne Tuytelaars, KU Leuven ESAT-PSI, Belgium (tinne.tuytelaars@esat.kuleuven.be) |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about open-sourcing its code or a link to a code repository. |
| Open Datasets | Yes | We use the MNIST dataset (LeCun et al., 1998) as a first task in a sequence of 5 tasks, where we randomly permute all the input pixels differently for tasks 2 to 5. ... For this we split the CIFAR-100 and the Tiny ImageNet (Yao & Miller, 2015) dataset into ten tasks, respectively. ... The 8-task sequence is composed of: 1. Oxford Flowers (Nilsback & Zisserman, 2008); 2. MIT Scenes (Quattoni & Torralba, 2009); 3. Caltech-UCSD Birds (Welinder et al., 2010); 4. Stanford Cars (Krause et al., 2013); 5. FGVC-Aircraft (Maji et al., 2013); 6. VOC Actions (Everingham et al.); 7. Letters (de Campos et al., 2009); and 8. SVHN (Netzer et al., 2011) datasets. (A sketch of the permuted-MNIST task construction appears after the table.) |
| Dataset Splits | No | The paper mentions using training data and evaluating on test accuracy, but it does not explicitly describe the dataset splits (e.g., train/validation/test percentages or counts) used in its experiments. It reports 'test accuracy' without describing a distinct validation set or the split sizes. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific library versions). |
| Experiment Setup | Yes | All tasks are trained for 10 epochs with a learning rate of 10^-2 using the SGD optimizer. ReLU is used as an activation function unless mentioned otherwise. ... We train the different tasks for 50 epochs with a learning rate of 10^-2 using the SGD optimizer. (A minimal training-loop sketch with these hyperparameters follows the table.) |
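
The Research Type row quotes a regularizer that encourages representation sparsity by letting each neuron inhibit only its neighbours. Below is a minimal PyTorch sketch of a penalty of that flavour, assuming a Gaussian weighting over neuron-index distance; the function name and the `sigma` and `scale` parameters are our own illustrative choices, and the sketch is not necessarily the paper's exact formulation.

```python
import torch

def local_inhibition_penalty(h, sigma=1.0, scale=1e-4):
    """Hypothetical locality-weighted neural-inhibition penalty.

    h: tensor of shape (batch, n_neurons) holding one layer's activations.
    Every pair of distinct neurons (i, j) is penalized by the batch-mean
    product of their activations, weighted by a Gaussian of the index
    distance |i - j|, so only neurons in a local neighbourhood inhibit
    each other.
    """
    n = h.shape[1]
    idx = torch.arange(n, dtype=h.dtype, device=h.device)
    weights = torch.exp(-((idx[:, None] - idx[None, :]) ** 2) / (2.0 * sigma ** 2))
    weights.fill_diagonal_(0.0)            # no self-inhibition
    pair_means = (h.t() @ h) / h.shape[0]  # batch-mean of h_i * h_j
    return scale * (weights * pair_means).sum()
```

As the quoted abstract notes, the paper combines this kind of regularizer with lifelong-learning methods that penalize changes to important previously learned parameters; that part of the method is not sketched here.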
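The Experiment Setup row reports SGD with a learning rate of 10^-2, ReLU activations, and 10 epochs per task (50 on the larger benchmarks). The sketch below wires those hyperparameters to the penalty above; the two-layer `fc1`/`fc2` model interface is an assumption for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

def train_task(model, loader, epochs=10, lr=1e-2):
    """Train one task with the reported hyperparameters (assumed model API)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            hidden = torch.relu(model.fc1(x))   # ReLU hidden activations
            logits = model.fc2(hidden)
            loss = ce(logits, y)
            # Representation-sparsity penalty on the hidden layer
            # (local_inhibition_penalty from the sketch above).
            loss = loss + local_inhibition_penalty(hidden)
            loss.backward()
            opt.step()
```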
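The Open Datasets row describes a 5-task permuted-MNIST sequence in which tasks 2 to 5 each apply a different random pixel permutation. A minimal NumPy sketch of that construction follows; the `seed` argument and the flattening convention are our own assumptions.

```python
import numpy as np

def make_permuted_mnist_tasks(x, n_tasks=5, seed=0):
    """Hypothetical sketch: build a permuted-MNIST task sequence.

    Task 1 keeps the original pixel order; tasks 2..n_tasks each apply a
    different fixed random permutation of the flattened pixels.
    """
    rng = np.random.RandomState(seed)
    x = x.reshape(len(x), -1)   # flatten 28x28 images to 784 pixels
    tasks = [x]                 # task 1: unpermuted MNIST
    for _ in range(n_tasks - 1):
        perm = rng.permutation(x.shape[1])
        tasks.append(x[:, perm])
    return tasks
```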