Prospective Representation Learning for Non-Exemplar Class-Incremental Learning
Authors: Wuxuan Shi, Mang Ye
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four benchmarks suggest the superior performance of our approach over the state-of-the-art. We also provide a detailed analysis of our method. |
| Researcher Affiliation | Academia | Wuxuan Shi¹, Mang Ye¹˒² — ¹School of Computer Science, Wuhan University, Wuhan, China; ²Taikang Center for Life and Medical Sciences, Wuhan University, Wuhan, China |
| Pseudocode | Yes | Algorithm 1 Proposed Method |
| Open Source Code | Yes | https://github.com/ShiWuxuan/NeurIPS2024-PRL |
| Open Datasets | Yes | We conduct comprehensive experiments on four public datasets: CIFAR-100 [62], Tiny ImageNet [63], ImageNet-Subset and ImageNet-1K [64]. |
| Dataset Splits | Yes | For ImageNet-1K, the learning rate starts at 0.0005 for all phases. The learning rate decays to 1/10 of the previous value every 70 epochs (160 epochs in total) in the base phase and every 45 epochs (100 epochs in total) in each incremental phase. |
| Hardware Specification | Yes | We conduct our experiments on an RTX 4090 GPU. |
| Software Dependencies | No | Our method is implemented with PyCIL [65]. While PyCIL is mentioned, specific version numbers for this or other critical libraries/frameworks (like Python, PyTorch/TensorFlow) are not provided. |
| Experiment Setup | Yes | The batch size is set to 64 for CIFAR-100 and Tiny ImageNet and 128 for ImageNet-Subset and ImageNet-1K. During training, the model is optimized by the Adam optimizer with β1 = 0.9, β2 = 0.999 and ϵ = 1e-8 (weight decay 2e-4). For ImageNet-1K, the learning rate starts at 0.0005 for all phases. The learning rate decays to 1/10 of the previous value every 70 epochs (160 epochs in total) in the base phase and every 45 epochs (100 epochs in total) in each incremental phase. For other datasets, the learning rate starts from 0.001 and decays to 1/10 of the previous value every 45 epochs (100 epochs in total) for all phases. We use λ = 0.5 and γ = 0.1 for all datasets. Regarding the loss weights, for comprehensive performance considerations and with reference to previous studies [6; 51], we set α1 = 10, α2 = 10, and α3 = 2 for training. |
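
The optimizer and learning-rate schedule quoted in the Experiment Setup row map onto a standard Adam + StepLR configuration. The sketch below assumes a PyTorch implementation (consistent with the PyCIL toolbox the paper builds on); the `build_optimizer` helper, its argument names, and the placeholder model are illustrative and not taken from the released code. Only the numeric values (learning rates, decay steps, epoch counts, Adam betas/epsilon, weight decay) come from the paper; the combined loss with weights α1 = 10, α2 = 10, α3 = 2 and the method hyperparameters λ = 0.5, γ = 0.1 are not reproduced here.

```python
# Minimal sketch of the reported optimization setup (not the authors' code).
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import StepLR


def build_optimizer(model: nn.Module, dataset: str, phase: str):
    """Return (optimizer, scheduler, num_epochs) following the reported setup.

    dataset: "imagenet-1k" or one of "cifar-100", "tiny-imagenet", "imagenet-subset"
    phase:   "base" or "incremental"
    """
    if dataset == "imagenet-1k":
        lr = 5e-4                                  # 0.0005 for all phases
        step = 70 if phase == "base" else 45       # decay every 70 / 45 epochs
        epochs = 160 if phase == "base" else 100
    else:                                          # CIFAR-100, Tiny ImageNet, ImageNet-Subset
        lr = 1e-3                                  # 0.001, decayed every 45 epochs
        step, epochs = 45, 100

    optimizer = optim.Adam(
        model.parameters(),
        lr=lr,
        betas=(0.9, 0.999),
        eps=1e-8,
        weight_decay=2e-4,
    )
    scheduler = StepLR(optimizer, step_size=step, gamma=0.1)  # decay LR to 1/10
    return optimizer, scheduler, epochs


if __name__ == "__main__":
    # Toy usage on a placeholder model; the real training loop would compute
    # the paper's combined loss (weights alpha1=10, alpha2=10, alpha3=2).
    model = nn.Linear(512, 100)
    opt, sched, epochs = build_optimizer(model, dataset="cifar-100", phase="incremental")
    for _ in range(epochs):
        # ... one training epoch ...
        sched.step()
```

Reading the schedule this way, a CIFAR-100 incremental phase runs 100 epochs at 1e-3, dropping to 1e-4 after epoch 45 and 1e-5 after epoch 90, which matches the "decays to 1/10 every 45 epochs" description.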