Task-Aware Information Routing from Common Representation Space in Lifelong Learning

Authors: Prashant Shivaram Bhat, Bahram Zonooz, Elahe Arani

ICLR 2023

Reproducibility assessment (each variable is listed with its result and the supporting LLM response):

Research Type: Experimental
LLM Response: Experimental results show that our method outperforms state-of-the-art rehearsal-based and dynamic sparse approaches and bridges the gap between fixed capacity and parameter isolation approaches while being scalable.

Researcher Affiliation: Collaboration
LLM Response: Prashant Bhat (1), Bahram Zonooz (1,2) & Elahe Arani (1,2); (1) Advanced Research Lab, NavInfo Europe, Netherlands; (2) Department of Mathematics and Computer Science, Eindhoven University of Technology, Netherlands
Pseudocode: Yes
LLM Response: Algorithm 1: Proposed Method
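The row above only records that the paper presents its method as pseudocode (Algorithm 1). As a rough aid, here is a minimal sketch of what a rehearsal-based training step with task-aware routing of a shared representation could look like, assembled solely from the ingredients quoted elsewhere in this report; the `buffer.sample` API, the per-task `tams` list, and the plain cross-entropy loss are assumptions for illustration, not the authors' Algorithm 1.

```python
# Hedged sketch of one rehearsal-based training step built only from the
# ingredients quoted in this report (shared backbone, task-aware modules,
# replay buffer). Names and losses are illustrative placeholders.
import torch
import torch.nn.functional as F

def training_step(backbone, tams, classifier, buffer, batch, task_id, optimizer):
    x, y = batch
    if len(buffer) > 0:                        # experience rehearsal
        x_buf, y_buf = buffer.sample(len(x))   # assumed buffer API
        x, y = torch.cat([x, x_buf]), torch.cat([y, y_buf])

    z = backbone(x)                            # common representation space
    z_routed = tams[task_id](z)                # task-aware routing (one module per task)
    logits = classifier(z_routed)

    loss = F.cross_entropy(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```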
Open Source Code: Yes
LLM Response: Code is available at: https://github.com/NeurAI-Lab/TAMiL

Open Datasets: Yes
LLM Response: We obtain Seq-CIFAR10, Seq-CIFAR100, and Seq-TinyImageNet by splitting CIFAR10 (Krizhevsky et al., 2009), CIFAR100 (Krizhevsky et al., 2009), and TinyImageNet (Le & Yang, 2015).
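Since the benchmarks are built by class-wise splitting of standard datasets, the sketch below shows one common way to construct a Seq-CIFAR10-style task stream; the 5-task, 2-classes-per-task split and the helper name make_sequential_tasks are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch: partition CIFAR-10 classes into disjoint, sequential tasks.
# The 5-task / 2-classes-per-task split is the usual convention for
# Seq-CIFAR10; treat these numbers as an assumption, not the paper's spec.
import numpy as np
from torch.utils.data import Subset
from torchvision import datasets, transforms

def make_sequential_tasks(root="./data", n_tasks=5, train=True):
    full = datasets.CIFAR10(root=root, train=train, download=True,
                            transform=transforms.ToTensor())
    targets = np.array(full.targets)
    classes_per_task = len(full.classes) // n_tasks  # 10 // 5 = 2

    tasks = []
    for t in range(n_tasks):
        cls = list(range(t * classes_per_task, (t + 1) * classes_per_task))
        idx = np.where(np.isin(targets, cls))[0]
        tasks.append(Subset(full, idx))              # one dataset per task
    return tasks

# Usage: iterate over the tasks in order to train a continual learner.
# for t, task_ds in enumerate(make_sequential_tasks()):
#     train_on_task(t, task_ds)
```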
Dataset Splits: No
LLM Response: The paper describes how the base datasets (CIFAR10, CIFAR100, TinyImageNet) are split into sequential tasks, but it does not explicitly provide train/validation/test split percentages or counts for reproduction. It does mention the Class-Incremental Learning (Class-IL) and Task-Incremental Learning (Task-IL) scenarios, which imply certain data handling, but no explicit split ratios are given.
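For readers unfamiliar with the two scenarios named above, the sketch below illustrates the usual distinction at evaluation time: Task-IL is told which task a sample belongs to and restricts predictions to that task's classes, while Class-IL must choose among all classes seen so far. This reflects the general convention, not code from the paper.

```python
# Illustration of the conventional Class-IL vs. Task-IL evaluation protocols
# (not the paper's code): Task-IL masks the logits to the known task's
# classes before taking the argmax; Class-IL uses all seen classes.
import torch

def predict_class_il(logits: torch.Tensor) -> torch.Tensor:
    # logits: [batch, n_seen_classes]; choose among every class seen so far.
    return logits.argmax(dim=1)

def predict_task_il(logits: torch.Tensor, task_classes: list) -> torch.Tensor:
    # Keep only the logits of the given task's classes, then take the argmax.
    masked = torch.full_like(logits, float("-inf"))
    masked[:, task_classes] = logits[:, task_classes]
    return masked.argmax(dim=1)
```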
Hardware Specification: No
LLM Response: The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments.

Software Dependencies: No
LLM Response: We build on top of the Mammoth (Buzzega et al., 2020) CL repository in PyTorch.
Experiment Setup: No
LLM Response: The paper describes the model backbone (ResNet-18), the TAM architecture (undercomplete autoencoders with specific dimensions), and the general training regime (sequential tasks, experience rehearsal, EMA model updates), but it does not explicitly provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) in the main text or the appendices provided.
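To make the quoted description more concrete, here is a minimal sketch of the two pieces it names: a task-attention-style module built as an undercomplete autoencoder over the shared feature space, and an EMA update of a slow copy of the model. The feature size (512, matching a ResNet-18 penultimate layer), latent size (64), and EMA decay (0.999) are illustrative assumptions, since the exact values are not reproduced in this report.

```python
# Minimal sketch of the components named above, with assumed shapes and
# hyperparameters (512-d features, 64-d latent, EMA decay 0.999 are
# illustrative guesses, not values from the paper).
import torch
import torch.nn as nn

class TAM(nn.Module):
    """Undercomplete autoencoder over the common representation space."""
    def __init__(self, feat_dim: int = 512, latent_dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feat_dim, latent_dim), nn.ReLU())
        self.decoder = nn.Linear(latent_dim, feat_dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Reconstruct (and thereby gate/route) the shared representation.
        return self.decoder(self.encoder(z))

@torch.no_grad()
def ema_update(ema_model: nn.Module, model: nn.Module, decay: float = 0.999):
    # Exponential moving average of the working model's parameters,
    # corresponding to the "EMA model updates" mentioned above.
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)

# Illustrative wiring (not from the paper): one TAM per task on top of a
# shared ResNet-18 feature extractor, with an EMA copy kept for stability.
# tams = nn.ModuleList(TAM() for _ in range(n_tasks))
# ema_backbone = copy.deepcopy(backbone); ema_update(ema_backbone, backbone)
```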