Task-recency bias strikes back: Adapting covariances in Exemplar-Free Class Incremental Learning
Authors: Grzegorz Rypeść, Sebastian Cygert, Tomasz Trzciński, Bartłomiej Twardowski
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | AdaGauss yields state-of-the-art results on popular EFCIL benchmarks and datasets when training from scratch or starting from a pre-trained backbone. |
| Researcher Affiliation | Collaboration | Grzegorz Rypeść, IDEAS NCBR, Warsaw University of Technology, grzegorz.rypesc@ideas-ncbr.pl; Sebastian Cygert, IDEAS NCBR, Gdańsk University of Technology, sebastian.cygert@ideas-ncbr.pl; Tomasz Trzciński, IDEAS NCBR, Warsaw University of Technology, Tooploox; Bartłomiej Twardowski, IDEAS NCBR, Autonomous University of Barcelona, Computer Vision Center |
| Pseudocode | Yes | Algorithm 1 AdaGauss: Adapting Gaussians in EFCIL |
| Open Source Code | Yes | Code: https://github.com/grypesc/AdaGauss |
| Open Datasets | Yes | We evaluate our method on several well-established benchmark datasets. CIFAR100 [19] consists of 50k training and 10k testing images in resolution 32x32. TinyImageNet [20], a subset of ImageNet [8], has 100k training and 10k testing images in 64x64 resolution. ImageNet-Subset contains 100 classes from ImageNet (ILSVRC 2012) [34]. |
| Dataset Splits | Yes | To allow covariance matrices to be invertible, we add a shrink value of 0.5, similarly to [10]. Intuitively, increasing the shrink value decreases the method's efficacy, as it artificially alters the covariance to be different from the ground truth representation. (See the covariance-shrinkage sketch below the table.) |
| Hardware Specification | Yes | We utilize a single machine with an NVIDIA RTX 4080 graphics card to run experiments. The time for execution of a single experiment varied depending on the dataset type, but it was at most ten hours. We attach details of utilized hyperparameters in scripts in the code repository. We report all results as the mean and variance of five runs. ... We measure the training and inference time of popular EFCIL methods using their original implementations on a single machine with NVIDIA GeForce RTX 4060 and AMD Ryzen 5 5600X CPU. |
| Software Dependencies | No | The paper mentions software frameworks like FACIL and PyCIL and the use of ResNet18, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We set λ = 10, N = 10000, d = 32 and add a single linear bottleneck layer at the end of F with S output dimensions, which define the latent space. When training from scratch, we set S = 64, while for fine-grained datasets, we decrease it to 32, as there are fewer examples per class. We use an SGD optimizer running for 200 epochs with a weight decay equal to 0.0005. When training from scratch, we utilize a starting learning rate (lr) of 0.1, decreased by ten times after 60, 120, and 180 epochs. We train the adapter using an SGD optimizer with weight decay of 0.0005, running for 100 epochs with a starting lr of 0.01; we decrease it ten times after 45 and 90 epochs. (See the optimizer-schedule sketch below the table.) |
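
The Dataset Splits row quotes the shrink value of 0.5 that the authors add so class covariance matrices remain invertible. The sketch below is a minimal illustration of that idea, not the authors' code: `feats` and `shrunk_covariance` are hypothetical names, and the 64-dimensional latent space mirrors the S = 64 setting quoted in the Experiment Setup row.

```python
# Minimal sketch (assumed, not the authors' implementation): shrinking a
# per-class covariance so it stays invertible, per the quoted shrink value of 0.5.
import torch

def shrunk_covariance(feats: torch.Tensor, shrink: float = 0.5) -> torch.Tensor:
    """Covariance of per-class features with a diagonal shrink term added."""
    centered = feats - feats.mean(dim=0, keepdim=True)
    cov = centered.T @ centered / (feats.shape[0] - 1)
    # Adding shrink * I makes the matrix positive definite (hence invertible),
    # at the cost of deviating from the ground-truth covariance, which is why
    # larger shrink values reduce efficacy.
    return cov + shrink * torch.eye(feats.shape[1])

# Example: 500 samples in a 64-dimensional latent space.
feats = torch.randn(500, 64)
cov = shrunk_covariance(feats)
torch.linalg.cholesky(cov)  # succeeds only if cov is positive definite
```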
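The Experiment Setup row lists concrete optimizer and schedule settings. The sketch below wires them into standard PyTorch objects under stated assumptions: the `backbone` and `adapter` modules are stand-in placeholders, and the epoch loop body is elided; only the quoted hyperparameters (SGD, weight decay 0.0005, lr 0.1 with 10x decays at epochs 60/120/180, and adapter lr 0.01 with decays at 45/90) are taken from the paper.

```python
# Minimal sketch, assuming plain PyTorch; models are placeholders, not the authors' code.
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

backbone = torch.nn.Linear(512, 64)  # stand-in for the feature extractor + bottleneck (S = 64)
adapter = torch.nn.Linear(64, 64)    # stand-in for the adapter network

# Main training: 200 epochs, lr 0.1 decayed 10x after epochs 60, 120 and 180.
opt_main = SGD(backbone.parameters(), lr=0.1, weight_decay=0.0005)
sched_main = MultiStepLR(opt_main, milestones=[60, 120, 180], gamma=0.1)

# Adapter training: 100 epochs, lr 0.01 decayed 10x after epochs 45 and 90.
opt_adapter = SGD(adapter.parameters(), lr=0.01, weight_decay=0.0005)
sched_adapter = MultiStepLR(opt_adapter, milestones=[45, 90], gamma=0.1)

for epoch in range(200):
    # ... one training epoch over the current task's data (omitted) ...
    sched_main.step()
```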