FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning

Authors: Dipam Goswami, Yuyang Liu, Bartłomiej Twardowski, Joost van de Weijer

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 Experiments; 4.1 Experimental Setup; 4.2 Experimental Results; Table 1: Average top-1 incremental accuracy in exemplar-free many-shot CIL with different numbers of incremental tasks.
Researcher Affiliation | Academia | Dipam Goswami (1,2), Yuyang Liu (3,4,5), Bartłomiej Twardowski (1,2,6), Joost van de Weijer (1,2). 1: Department of Computer Science, Universitat Autònoma de Barcelona; 2: Computer Vision Center, Barcelona; 3: University of Chinese Academy of Sciences; 4: State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences; 5: Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences; 6: IDEAS-NCBR
Pseudocode | No | The paper describes its method using prose and mathematical equations but does not include any structured pseudocode or algorithm blocks (a hedged sketch of the implied classification rule is given after this table).
Open Source Code | Yes | Code is available at https://github.com/dipamgoswami/FeCAM.
Open Datasets | Yes | We conduct experiments on three publicly available datasets: 1) CIFAR100 [26]... 2) TinyImageNet [28]... 3) ImageNet-Subset [12]... 1) CIFAR100 (described above); 2) miniImageNet [58]... 3) Caltech-UCSD Birds-200-2011 (CUB200) [59]
Dataset Splits | No | The paper specifies training and testing sample counts for datasets (e.g., 'CIFAR100 [26] consisting of 100 classes, 32×32 pixel images with 500 and 100 images per class for training and testing, respectively'), and describes how classes are distributed across incremental tasks. However, it does not explicitly specify a separate validation split with the percentages or counts needed for full reproducibility across all experiments.
Hardware Specification | Yes | Using one Nvidia RTX 6000 GPU, FeTrIL takes 44 minutes to complete all the new tasks while FeCAM takes only 6 minutes.
Software Dependencies | No | The paper states 'We use PyCIL [73] framework for our experiments.' but does not provide specific version numbers for PyCIL or any other software dependencies, such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | We use PyCIL [73] framework for our experiments. For both MSCIL and FSCIL settings, the main network architecture is ResNet-18 [18] trained on the first task using SGD with an initial learning rate of 0.1 and a weight decay of 0.0001 for 200 epochs. For the shrinkage, we use γ1 = 1 and γ2 = 1 for many-shot CIL and higher values γ1 = 100 and γ2 = 100 for few-shot CIL in our experiments. Following most methods, we store all the class prototypes. Similar to [78], we also store the covariance matrices for all classes seen until the current task. In the experiments with visual transformers, we use ViT-B/16 [15] architecture pretrained on ImageNet-21k [52]. The extracted features are 512-dimensional when using ResNet-18 and 768-dimensional when using pretrained ViT. More implementation details for all hyperparameters are provided in the supplementary material. In our experiments, we use λ = 0.5 following [67]. (A hypothetical training skeleton matching this recipe is sketched after the table.)
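
For concreteness, a hypothetical PyTorch skeleton of the quoted first-task recipe (ResNet-18 trained with SGD, initial learning rate 0.1, weight decay 0.0001, 200 epochs) is sketched below. The class count, dummy data, batch size, and the absence of momentum or a learning-rate schedule are assumptions; none of them come from the paper.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18

# Placeholder class count and dummy data: NOT from the paper, only here to
# keep the sketch self-contained and runnable.
NUM_FIRST_TASK_CLASSES = 50
train_loader = DataLoader(
    TensorDataset(torch.randn(8, 3, 32, 32),
                  torch.randint(0, NUM_FIRST_TASK_CLASSES, (8,))),
    batch_size=4, shuffle=True,
)

model = resnet18(num_classes=NUM_FIRST_TASK_CLASSES)
# Quoted recipe: SGD, initial learning rate 0.1, weight decay 0.0001.
# Momentum and any LR schedule are not quoted, so none are set here.
optimizer = optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(200):  # 200 epochs, as quoted
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```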
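
Because the method itself is given only in prose and equations, the following is a minimal NumPy sketch of the classification rule the quoted details suggest: features are Tukey-transformed (reading the quoted λ = 0.5 as the ladder-of-powers exponent), per-class covariances are shrunk with γ1 and γ2 and correlation-normalized, and a test feature is assigned to the class with the smallest squared Mahalanobis distance to its prototype. The function names, the eps stabilizer, and the exact ordering of the steps are assumptions, not the authors' implementation; the repository linked above contains the real code.

```python
import numpy as np

def tukey(x, lam=0.5):
    # Tukey's ladder-of-powers transformation; the quoted lambda = 0.5 is
    # read as this exponent. Assumes non-negative (e.g., post-ReLU) features.
    return np.power(x, lam) if lam != 0 else np.log(x)

def shrink(cov, gamma1=1.0, gamma2=1.0):
    # Covariance shrinkage: add the scaled mean variance to the diagonal and
    # the scaled mean covariance to the off-diagonal entries (gamma1 =
    # gamma2 = 1 is the quoted many-shot setting).
    d = cov.shape[0]
    eye = np.eye(d)
    v1 = np.trace(cov) / d                          # mean diagonal value
    v2 = (cov.sum() - np.trace(cov)) / (d * d - d)  # mean off-diagonal value
    return cov + gamma1 * v1 * eye + gamma2 * v2 * (1.0 - eye)

def normalize(cov, eps=1e-8):
    # Correlation normalization: divide each entry by the product of the
    # corresponding standard deviations.
    sd = np.sqrt(np.diag(cov)) + eps
    return cov / np.outer(sd, sd)

def predict(feature, prototypes, covariances, lam=0.5):
    # Nearest prototype under the squared Mahalanobis distance with a
    # shrunk, normalized per-class covariance.
    f = tukey(feature, lam)
    dists = []
    for mu, cov in zip(prototypes, covariances):
        inv = np.linalg.inv(normalize(shrink(cov)))
        diff = f - tukey(mu, lam)
        dists.append(float(diff @ inv @ diff))
    return int(np.argmin(dists))
```

In the quoted setup, `feature` would be the 512-dimensional ResNet-18 (or 768-dimensional ViT) embedding, and `prototypes` and `covariances` the per-class statistics the paper says are stored for all classes seen so far.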