FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning
Authors: Dipam Goswami, Yuyang Liu, Bartłomiej Twardowski, Joost van de Weijer
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper includes empirical sections and results: '4 Experiments', '4.1 Experimental Setup', '4.2 Experimental Results', and 'Table 1: Average top-1 incremental accuracy in exemplar-free many-shot CIL with different numbers of incremental tasks.' |
| Researcher Affiliation | Academia | Dipam Goswami (1,2), Yuyang Liu (3,4,5), Bartłomiej Twardowski (1,2,6), Joost van de Weijer (1,2). Affiliations: 1) Department of Computer Science, Universitat Autònoma de Barcelona; 2) Computer Vision Center, Barcelona; 3) University of Chinese Academy of Sciences; 4) State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences; 5) Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences; 6) IDEAS-NCBR |
| Pseudocode | No | The paper describes its methods using prose and mathematical equations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/dipamgoswami/FeCAM. |
| Open Datasets | Yes | We conduct experiments on three publicly available datasets: 1) CIFAR100 [26]... 2) TinyImageNet [28]... 3) ImageNet-Subset [12]... and, for the few-shot setting: 1) CIFAR100 (described above); 2) miniImageNet [58]... 3) Caltech-UCSD Birds-200-2011 (CUB200) [59] |
| Dataset Splits | No | The paper specifies training and testing sample counts for datasets (e.g., 'CIFAR100 [26] consisting of 100 classes, 32×32 pixel images with 500 and 100 images per class for training and testing, respectively') and describes how classes are distributed across incremental tasks. However, it does not explicitly specify a separate validation split with percentages or counts needed for full reproducibility across all experiments. (A hedged split sketch follows the table.) |
| Hardware Specification | Yes | Using one Nvidia RTX 6000 GPU, FeTrIL takes 44 minutes to complete all the new tasks while FeCAM takes only 6 minutes. |
| Software Dependencies | No | The paper states 'We use PyCIL [73] framework for our experiments.' but does not provide specific version numbers for PyCIL or any other software dependencies, such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We use PyCIL [73] framework for our experiments. For both MSCIL and FSCIL settings, the main network architecture is ResNet-18 [18] trained on the first task using SGD with an initial learning rate of 0.1 and a weight decay of 0.0001 for 200 epochs. For the shrinkage, we use γ1 = 1 and γ2 = 1 for many-shot CIL and higher values γ1 = 100 and γ2 = 100 for few-shot CIL in our experiments. Following most methods, we store all the class prototypes. Similar to [78], we also store the covariance matrices for all classes seen until the current task. In the experiments with visual transformers, we use ViT-B/16 [15] architecture pretrained on ImageNet-21k [52]. The extracted features are 512-dimensional when using ResNet-18 and 768-dimensional when using the pretrained ViT. More implementation details for all hyperparameters are provided in the supplementary material. In our experiments, we use λ = 0.5 following [67]. (Hedged sketches of the first-task training loop and the FeCAM classifier follow the table.) |
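
To make the quoted split counts concrete, below is a minimal sketch of a class-incremental split of CIFAR-100 (100 classes, 500 train / 100 test images per class). The initial-task size, increment size, and seed here are illustrative assumptions, not values taken from the paper, which reports results for several different numbers of incremental tasks.

```python
import numpy as np
from torchvision.datasets import CIFAR100

# CIFAR-100: 100 classes, 500 train / 100 test images per class (32x32).
train_set = CIFAR100(root="data", train=True, download=True)

def class_incremental_splits(targets, num_classes=100, initial=50, increment=5, seed=0):
    # Shuffle the class order once, then carve it into an initial task
    # followed by equally sized incremental tasks (all sizes are assumptions).
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_classes)
    tasks = [order[:initial]]
    tasks += [order[s:s + increment] for s in range(initial, num_classes, increment)]
    # Map each task's class set to the indices of its training samples.
    return [np.flatnonzero(np.isin(targets, t)) for t in tasks]

train_idx_per_task = class_incremental_splits(np.array(train_set.targets))
```

Consistent with the 'Dataset Splits' entry, no validation indices are produced: evaluation in this setting is typically run on the full test split of all classes seen so far.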
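The first-task training recipe quoted in the 'Experiment Setup' row (ResNet-18 trained with SGD, initial learning rate 0.1, weight decay 0.0001, 200 epochs) translates roughly to the sketch below. Batch size, momentum, learning-rate schedule, and augmentation are not stated in the quoted text, so those choices are assumptions; `train_idx_per_task` is reused from the split sketch above.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, Subset
from torchvision import transforms
from torchvision.datasets import CIFAR100
from torchvision.models import resnet18

# First-task data only (indices from the split sketch above); normalization
# and augmentation are omitted for brevity.
train_set = CIFAR100("data", train=True, download=True, transform=transforms.ToTensor())
loader = DataLoader(Subset(train_set, train_idx_per_task[0]),
                    batch_size=128, shuffle=True)  # batch size assumed

# Head sized for all 100 labels so the original CIFAR-100 targets can be
# used without remapping; only first-task classes appear in the loader.
model = resnet18(num_classes=100)
optimizer = optim.SGD(model.parameters(), lr=0.1,
                      weight_decay=1e-4, momentum=0.9)  # momentum assumed
criterion = nn.CrossEntropyLoss()

for epoch in range(200):  # 200 epochs, as stated in the paper
    for images, labels in loader:
        optimizer.zero_grad()
        criterion(model(images), labels).backward()
        optimizer.step()
```

After this single training phase the backbone is frozen; later FeCAM tasks only update class statistics, which is what makes the 6-minute runtime quoted in the 'Hardware Specification' row possible.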
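The remaining quoted details (class prototypes, per-class covariance matrices, shrinkage parameters γ1 and γ2, and λ = 0.5 for the Tukey-style power transform following [67]) describe a Mahalanobis-distance classifier over frozen features. The NumPy sketch below is a simplified rendition under those assumptions; it omits steps such as the paper's covariance normalization, so it is illustrative rather than the authors' implementation.

```python
import numpy as np

def tukey_transform(x, lam=0.5):
    # Power transform with lambda = 0.5 as quoted; assumes non-negative
    # features (e.g., post-ReLU activations).
    return np.power(x, lam)

def shrink_cov(cov, gamma1=1.0, gamma2=1.0):
    # Shrinkage: add gamma1 * (mean of diagonal) to the diagonal and
    # gamma2 * (mean of off-diagonal) to the off-diagonal entries.
    d = cov.shape[0]
    diag_mean = np.trace(cov) / d
    off_diag = cov - np.diag(np.diag(cov))
    off_mean = off_diag.sum() / (d * (d - 1))
    eye = np.eye(d)
    return cov + gamma1 * diag_mean * eye + gamma2 * off_mean * (1.0 - eye)

class FeCAMClassifier:
    """Prototype plus per-class covariance classifier (squared Mahalanobis)."""

    def __init__(self, gamma1=1.0, gamma2=1.0):  # gamma1 = gamma2 = 1 for MSCIL
        self.gamma1, self.gamma2 = gamma1, gamma2
        self.means, self.inv_covs = {}, {}

    def fit_task(self, feats, labels):
        # feats: (N, D) frozen-backbone features of the current task's classes.
        feats = tukey_transform(feats)
        for c in np.unique(labels):
            fc = feats[labels == c]
            self.means[c] = fc.mean(axis=0)
            cov = shrink_cov(np.cov(fc, rowvar=False), self.gamma1, self.gamma2)
            self.inv_covs[c] = np.linalg.pinv(cov)

    def predict(self, feats):
        feats = tukey_transform(feats)
        classes = sorted(self.means)
        # Squared Mahalanobis distance to every stored class prototype.
        dists = np.stack([np.einsum("nd,de,ne->n",
                                    feats - self.means[c],
                                    self.inv_covs[c],
                                    feats - self.means[c])
                          for c in classes], axis=1)
        return np.array(classes)[dists.argmin(axis=1)]
```

At each new task only `fit_task` is called on the new classes' features; nothing is retrained, and only the prototypes and covariance matrices are stored, matching the storage described in the 'Experiment Setup' row.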