Prospective Representation Learning for Non-Exemplar Class-Incremental Learning

Authors: Wuxuan Shi, Mang Ye

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on four benchmarks suggest the superior performance of our approach over the state-of-the-art. We also provide a detailed analysis of our method.
Researcher Affiliation | Academia | Wuxuan Shi (1), Mang Ye (1,2); (1) School of Computer Science, Wuhan University, Wuhan, China; (2) Taikang Center for Life and Medical Sciences, Wuhan University, Wuhan, China
Pseudocode | Yes | Algorithm 1: Proposed Method
Open Source Code | Yes | https://github.com/ShiWuxuan/NeurIPS2024-PRL
Open Datasets | Yes | We conduct comprehensive experiments on four public datasets: CIFAR-100 [62], TinyImageNet [63], ImageNet-Subset, and ImageNet-1K [64].
Dataset Splits | Yes | For ImageNet-1K, the learning rate starts at 0.0005 for all phases. The learning rate decays to 1/10 of the previous value every 70 epochs (160 epochs in total) in the base phase and every 45 epochs (100 epochs in total) in each incremental phase.
Hardware Specification | Yes | We conduct our experiments on an RTX 4090 GPU.
Software Dependencies | No | Our method is implemented with PyCIL [65]. While PyCIL is mentioned, specific version numbers for it or for other critical libraries/frameworks (such as Python or PyTorch) are not provided.
Experiment Setup | Yes | The batch size is set to 64 for CIFAR-100 and TinyImageNet and 128 for ImageNet-Subset and ImageNet-1K. During training, the model is optimized by the Adam optimizer with β1 = 0.9, β2 = 0.999, and ε = 1e-8 (weight decay 2e-4). For ImageNet-1K, the learning rate starts at 0.0005 for all phases. The learning rate decays to 1/10 of the previous value every 70 epochs (160 epochs in total) in the base phase and every 45 epochs (100 epochs in total) in each incremental phase. For other datasets, the learning rate starts from 0.001 and decays to 1/10 of the previous value every 45 epochs (100 epochs in total) for all phases. We use λ = 0.5 and γ = 0.1 for all datasets. Regarding the loss weights, for comprehensive performance considerations and with reference to previous studies [6; 51], we set α1 = 10, α2 = 10, and α3 = 2 for training.
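
The following is a minimal PyTorch sketch of the optimizer and learning-rate schedule quoted above, not the authors' released code. The ResNet-18 backbone, the SETTINGS dictionary, and the build_optimizer helper are illustrative assumptions; only the hyperparameter values (Adam betas and epsilon, weight decay, batch sizes, initial learning rates, and step-decay intervals) come from the reported setup.

```python
# Minimal sketch (assumptions noted above): reproducing the reported optimizer
# and step-decay learning-rate schedule with standard PyTorch components.
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from torchvision.models import resnet18

# Reported per-dataset settings: batch size, initial LR, decay step, total epochs.
SETTINGS = {
    "cifar100":        dict(batch_size=64,  lr=1e-3, step=45, epochs=100),
    "tinyimagenet":    dict(batch_size=64,  lr=1e-3, step=45, epochs=100),
    "imagenet_subset": dict(batch_size=128, lr=1e-3, step=45, epochs=100),
    # ImageNet-1K: base phase runs 160 epochs with decay every 70 epochs,
    # each incremental phase runs 100 epochs with decay every 45 epochs.
    "imagenet1k_base": dict(batch_size=128, lr=5e-4, step=70, epochs=160),
    "imagenet1k_inc":  dict(batch_size=128, lr=5e-4, step=45, epochs=100),
}

def build_optimizer(model: torch.nn.Module, cfg: dict):
    """Adam with beta1=0.9, beta2=0.999, eps=1e-8, weight decay 2e-4,
    plus a schedule that multiplies the LR by 0.1 every `step` epochs."""
    optimizer = Adam(model.parameters(), lr=cfg["lr"],
                     betas=(0.9, 0.999), eps=1e-8, weight_decay=2e-4)
    scheduler = StepLR(optimizer, step_size=cfg["step"], gamma=0.1)
    return optimizer, scheduler

if __name__ == "__main__":
    model = resnet18(num_classes=100)  # placeholder backbone, not the paper's
    cfg = SETTINGS["cifar100"]
    optimizer, scheduler = build_optimizer(model, cfg)
    for epoch in range(cfg["epochs"]):
        # ... one training epoch over the current task would go here ...
        scheduler.step()
```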