Elastic Feature Consolidation For Cold Start Exemplar-Free Incremental Learning

Authors: Simone Magistri, Tomaso Trinci, Albin Soutif-Cormerais, Joost van de Weijer, Andrew D. Bagdanov

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on CIFAR-100, Tiny-ImageNet, ImageNet-Subset and ImageNet-1K demonstrate that Elastic Feature Consolidation is better able to learn new tasks by maintaining model plasticity and significantly outperforms the state-of-the-art.
Researcher Affiliation | Academia | Department of Information Engineering, University of Florence, Italy; Computer Vision Center, Universitat Autònoma de Barcelona, Spain
Pseudocode | Yes | In Appendix D we provide the pseudocode of the overall training procedure (Algorithm 1: Elastic Feature Consolidation).
Open Source Code | Yes | Code to reproduce experiments is available at https://github.com/simomagi/elastic_feature_consolidation
Open Datasets | Yes | We consider three standard datasets: CIFAR-100 (Krizhevsky et al., 2009), Tiny-ImageNet (Wu et al., 2017) and ImageNet-Subset (Deng et al., 2009). Each is evaluated in two settings.
Dataset Splits | No | The paper specifies the train/test split for CIFAR-100 (500 images for training and 100 for testing per class), but does not explicitly state a validation split (percentages or sample counts) for any of the datasets used (see the verification sketch after the table).
Hardware Specification | No | The paper does not explicitly state the hardware used for the experiments (e.g., GPU or CPU models, or cloud computing instances with specifications).
Software Dependencies | No | The paper mentions using the Adam optimizer and a ResNet-18 backbone, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or CUDA).
Experiment Setup | Yes | For the incremental steps of EFC we used Adam with weight decay of 2e-4 and a fixed learning rate of 1e-4 for Tiny-ImageNet and CIFAR-100, while for ImageNet-Subset we use a learning rate of 1e-5 for the backbone and 1e-4 for the heads. We fixed the total number of epochs to 100 and use a batch size of 64. We set λ_EFM = 10 and η = 0.1 in Eq. 9 for all the experiments.
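
To make the quoted training configuration concrete, here is a minimal sketch of how it could be expressed in PyTorch. The `backbone`/`heads` submodule names and the `make_optimizer` helper are illustrative assumptions, not the authors' released code; only the numbers (Adam, weight decay 2e-4, the per-dataset learning rates, 100 epochs, batch size 64, λ_EFM = 10, η = 0.1) come from the paper.

```python
# Hedged sketch of the quoted optimizer/hyperparameter setup.
# Submodule names `backbone` and `heads` are assumptions for illustration.
import torch


def make_optimizer(model: torch.nn.Module, dataset: str) -> torch.optim.Adam:
    if dataset in {"cifar100", "tiny_imagenet"}:
        # Single learning rate of 1e-4 with weight decay 2e-4 (as quoted).
        return torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=2e-4)
    if dataset == "imagenet_subset":
        # Lower backbone learning rate (1e-5) and 1e-4 for the heads (as quoted).
        return torch.optim.Adam(
            [
                {"params": model.backbone.parameters(), "lr": 1e-5},
                {"params": model.heads.parameters(), "lr": 1e-4},
            ],
            weight_decay=2e-4,
        )
    raise ValueError(f"unknown dataset: {dataset}")


EPOCHS = 100        # total epochs per incremental step (as quoted)
BATCH_SIZE = 64     # batch size (as quoted)
LAMBDA_EFM = 10.0   # λ_EFM in Eq. 9 of the paper
ETA = 0.1           # η in Eq. 9 of the paper
```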
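
Relating to the Dataset Splits row above: the quoted 500 training / 100 test images per class for CIFAR-100 can be checked directly with torchvision. This is a small verification sketch, not part of the paper's code, and it does not introduce the validation split the paper leaves unspecified.

```python
# Verify CIFAR-100's per-class train/test counts (500 / 100) with torchvision.
from collections import Counter

from torchvision.datasets import CIFAR100

train_set = CIFAR100(root="./data", train=True, download=True)
test_set = CIFAR100(root="./data", train=False, download=True)

train_counts = Counter(train_set.targets)  # expected: 500 images per class
test_counts = Counter(test_set.targets)    # expected: 100 images per class

print(len(train_set), len(test_set))                            # 50000 10000
print(min(train_counts.values()), max(train_counts.values()))   # 500 500
print(min(test_counts.values()), max(test_counts.values()))     # 100 100
```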