BiRT: Bio-inspired Replay in Vision Transformers for Continual Learning

Authors: Kishaan Jeeveswaran, Prashant Shivaram Bhat, Bahram Zonooz, Elahe Arani

ICML 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Table 1. Results on multiple datasets learned with 10 tasks with varying buffer sizes, averaged over multiple class orders. BiRT achieves consistent improvements over DyTox on different metrics, i.e., accuracy, forgetting, BWT, and FWT. The last accuracy reports the performance on past tasks after learning the last task, and the average accuracy is the mean of the last accuracy after learning each task. (The metrics are sketched below the table.) |
| Researcher Affiliation | Collaboration | ¹Advanced Research Lab, NavInfo Europe, Netherlands; ²Department of Mathematics and Computer Science, Eindhoven University of Technology, Netherlands. |
| Pseudocode | Yes | Algorithm 1 BiRT Algorithm |
| Open Source Code | Yes | ¹Code available at github.com/NeurAI-Lab/BiRT. |
| Open Datasets | Yes | We evaluate our approach on CIFAR-100 (Krizhevsky et al., 2009), ImageNet-100 (Deng et al., 2009), and TinyImageNet (Le and Yang, 2015). (A dataset-loading sketch appears below the table.) |
| Dataset Splits | Yes | ImageNet-100 consists of 129k train and 5,000 validation images of size 224×224 belonging to 100 classes. |
| Hardware Specification | Yes | All models are trained on a single NVIDIA V100 GPU, and all evaluations are performed on a single NVIDIA RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions using the 'continuum library (Douillard and Lesort, 2021)' and building on the 'DyTox (Douillard et al., 2021) framework', but it does not specify version numbers for these or for other software components such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | We train models with a learning rate of 5e-4, a batch size of 128, and a weight decay of 1e-6. All models, including the baseline, are trained for 500 epochs per task in CIFAR-100 (Krizhevsky et al., 2009), TinyImageNet (Le and Yang, 2015), and ImageNet-100 (Deng et al., 2009). (A training-configuration sketch appears below the table.) |
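For context on the metrics named in the Research Type row (accuracy, forgetting, BWT, FWT), the definitions below are the ones commonly used in the continual learning literature; the paper may use slight variants, so treat this as a hedged reference rather than the authors' exact formulation. Here $a_{i,j}$ is the test accuracy on task $j$ after training on task $i$, $T$ is the number of tasks, and $\tilde{a}_j$ is the accuracy of a randomly initialized model on task $j$.

```latex
% Average accuracy after learning the last task
\mathrm{ACC} = \frac{1}{T} \sum_{j=1}^{T} a_{T,j}

% Backward transfer: effect of later learning on earlier tasks (negative values indicate forgetting)
\mathrm{BWT} = \frac{1}{T-1} \sum_{j=1}^{T-1} \left( a_{T,j} - a_{j,j} \right)

% Forward transfer: effect of earlier learning on not-yet-seen tasks
\mathrm{FWT} = \frac{1}{T-1} \sum_{j=2}^{T} \left( a_{j-1,j} - \tilde{a}_{j} \right)

% Average forgetting: drop from each task's best past accuracy to its final accuracy
F = \frac{1}{T-1} \sum_{j=1}^{T-1} \left( \max_{i \in \{1,\dots,T-1\}} a_{i,j} \; - \; a_{T,j} \right)
```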
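The Open Datasets and Dataset Splits rows describe class-incremental streams with 10 tasks. A minimal sketch of how such a split can be built with the continuum library mentioned in the Software Dependencies row is shown below; the data path and the 10-classes-per-task increment are assumptions for illustration, not the authors' quoted configuration.

```python
# Minimal sketch (assumed usage): a 10-task class-incremental CIFAR-100 stream
# built with the continuum library.
from torch.utils.data import DataLoader
from continuum import ClassIncremental
from continuum.datasets import CIFAR100

train_set = CIFAR100("data/", train=True, download=True)  # "data/" is a placeholder path
scenario = ClassIncremental(train_set, increment=10)       # 100 classes -> 10 tasks of 10 classes

for task_id, taskset in enumerate(scenario):
    # Each task's subset behaves like a regular PyTorch dataset.
    loader = DataLoader(taskset, batch_size=128, shuffle=True)
    print(f"task {task_id}: {len(taskset)} training images")
```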
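The hyperparameters quoted in the Experiment Setup row translate into the following hedged training skeleton. Only the learning rate, batch size, weight decay, and epochs-per-task values come from the excerpt; the optimizer choice (AdamW), the stand-in linear model, and the `train_one_task` helper are assumptions for illustration.

```python
# Hedged sketch of the quoted training hyperparameters; the model and optimizer
# are placeholders, not the authors' implementation.
import torch
import torch.nn as nn

LR, WEIGHT_DECAY = 5e-4, 1e-6          # quoted in the Experiment Setup row
BATCH_SIZE, EPOCHS_PER_TASK = 128, 500  # quoted in the Experiment Setup row

model = nn.Linear(3 * 32 * 32, 100)     # stand-in for the transformer backbone (assumption)
optimizer = torch.optim.AdamW(model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)
criterion = nn.CrossEntropyLoss()

def train_one_task(loader):
    """Run the quoted number of epochs over one task's data loader."""
    model.train()
    for _ in range(EPOCHS_PER_TASK):
        for images, labels, _task_ids in loader:  # continuum loaders yield (x, y, t) triplets
            optimizer.zero_grad()
            logits = model(images.flatten(1))     # flatten CIFAR images for the toy linear model
            loss = criterion(logits, labels)
            loss.backward()
            optimizer.step()
```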