Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning
Authors: Depeng Li, Tianqi Wang, Junwei Chen, Wei Dai, Zhigang Zeng
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method achieves strong CIL performance in rehearsal-free and minimal-expansion settings with different backbones. (Section 5, Experiment) |
| Researcher Affiliation | Academia | Depeng Li¹, Tianqi Wang¹, Junwei Chen¹, Wei Dai², Zhigang Zeng¹. ¹School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China; ²School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China. Correspondence to: Zhigang Zeng <zgzeng@hust.edu.cn> |
| Pseudocode | Yes | We summarize its CIL procedure in Algorithm 1 in Appendix B. |
| Open Source Code | No | The paper does not provide a direct link to the source code or an explicit statement of code release for their own work. It only refers to 'original codebases' for baselines. |
| Open Datasets | Yes | We experiment on multiple datasets commonly used for CIL. Small Scale: Both MNIST (LeCun et al., 1998) and Fashion-MNIST (Xiao et al., 2017) are respectively split into 5 disjoint tasks with 2 classes per task. Medium Scale: CIFAR-100 (Krizhevsky et al., 2009) is divided into 10 (25) tasks with each task containing 10 (4) disjoint classes. Large Scale: ImageNet-R (Hendrycks et al., 2021) |
| Dataset Splits | Yes | When conducting experiments with different datasets, we keep about 10% of the training data from each task for validation. |
| Hardware Specification | Yes | We run experiments on extensive datasets adapted for CIL under different widely used backbones, which are implemented in PyTorch with NVIDIA RTX 3080 Ti GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not provide a specific version number. No other software dependencies are listed with their version numbers. |
| Experiment Setup | Yes | We use the SGD optimizer with an initial learning rate (0.001 for MNIST, Fashion-MNIST; 0.01 for the remaining)... In our method... we use L_max(t) = 200 and R(t) = 99%; for CIFAR-100, we use L_max(t) = 1000 and R(t) = 90% (CIFAR-100/10), and L_max(t) = 500 and R(t) = 80% (CIFAR-100/25). Similarly, we empirically set the step size l = 10 for node expansion each time and the maximum times of random generation T_max = 50... r(t) = 0.9 and µ_L(t) = (1 − r(t)) / (L + 1). |
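
The Open Datasets and Dataset Splits rows describe a standard class-incremental protocol (e.g., CIFAR-100 split into 10 disjoint tasks of 10 classes each, with roughly 10% of each task's training data held out for validation). Below is a minimal sketch of how such a split could be built in PyTorch; the function name, fixed class order, and random seed are illustrative assumptions rather than details from the paper.

```python
# Minimal sketch (not the authors' code): split CIFAR-100 into 10 disjoint
# class-incremental tasks and hold out ~10% of each task's training data
# for validation, as described in the reproducibility rows above.
import numpy as np
from torchvision import datasets, transforms
from torch.utils.data import Subset

def build_cil_tasks(num_tasks=10, classes_per_task=10, val_fraction=0.1, seed=0):
    """Return (train_subset, val_subset) pairs, one per incremental task."""
    rng = np.random.default_rng(seed)
    train_set = datasets.CIFAR100(root="./data", train=True, download=True,
                                  transform=transforms.ToTensor())
    targets = np.array(train_set.targets)
    # Fixed class order; the paper does not specify whether classes are shuffled.
    class_order = np.arange(num_tasks * classes_per_task)

    tasks = []
    for t in range(num_tasks):
        task_classes = class_order[t * classes_per_task:(t + 1) * classes_per_task]
        idx = np.where(np.isin(targets, task_classes))[0]
        rng.shuffle(idx)
        n_val = int(len(idx) * val_fraction)          # ~10% held out per task
        val_subset = Subset(train_set, idx[:n_val].tolist())
        train_subset = Subset(train_set, idx[n_val:].tolist())
        tasks.append((train_subset, val_subset))
    return tasks
```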
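
The Experiment Setup row lists the quoted optimizer and per-dataset hyperparameters. The sketch below simply collects those values into a configuration dictionary and builds the corresponding SGD optimizer. The dictionary keys are assumed names, the assignment of L_max = 200 and R = 99% to the small-scale datasets is inferred from the elided quote, and no momentum or weight decay is set since none are quoted.

```python
# Minimal sketch (assumed structure, not the authors' code): gather the
# hyperparameters quoted in the Experiment Setup row and build the optimizer.
import torch

HPARAMS = {
    # Small-scale assignment of L_max/R inferred from the elided quote.
    "mnist":         {"lr": 0.001, "L_max": 200,  "R": 0.99},
    "fashion_mnist": {"lr": 0.001, "L_max": 200,  "R": 0.99},
    "cifar100_10":   {"lr": 0.01,  "L_max": 1000, "R": 0.90},
    "cifar100_25":   {"lr": 0.01,  "L_max": 500,  "R": 0.80},
}
# Expansion settings quoted as shared across datasets.
EXPANSION = {"step_size_l": 10, "T_max": 50, "r": 0.9}

def make_optimizer(model, dataset_key):
    """SGD with the quoted initial learning rate; no momentum/weight decay assumed."""
    cfg = HPARAMS[dataset_key]
    return torch.optim.SGD(model.parameters(), lr=cfg["lr"])
```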