An Effective Dynamic Gradient Calibration Method for Continual Learning
Authors: Weichen Lin, Jiaxiang Chen, Ruomin Huang, Hu Ding
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also conduct a set of experiments on several benchmark datasets to evaluate the performance in practice. ... Finally, we conduct a set of comprehensive experiments on the popular datasets S-CIFAR10, S-CIFAR100 and S-Tiny-ImageNet; the experimental results suggest that our method can improve the Final Average Incremental Accuracy (FAIA) in several CL scenarios by more than 6%. |
| Researcher Affiliation | Academia | (1) School of Data Science, University of Science and Technology of China, Anhui, China; (2) Duke University; (3) School of Computer Science and Technology, University of Science and Technology of China, Anhui, China. |
| Pseudocode | Yes | Algorithm 1: DGC procedure; Algorithm 2: DGC procedure in TFCL. (A hedged hook-point sketch follows the table.) |
| Open Source Code | No | The paper refers to implementations of *other* methods (Mammoth, PyCIL, google-research) that are open-source, but it does not explicitly state that its *own* proposed method's code is open-source or provide a link for it. |
| Open Datasets | Yes | We carry out the experiments on three widely employed datasets: S(Split)-CIFAR10, S-CIFAR100 (Krizhevsky et al., 2009), and S-Tiny-ImageNet (Le & Yang, 2015). |
| Dataset Splits | Yes | For the hyperparameter selection of different methods, we directly use the original hyperparameters used in these open-source frameworks, which are obtained by using grid search on 10% of the training set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using ResNet18 as the base network and refers to implementations from Mammoth, PyCIL, and google-research. However, it does not provide specific version numbers for software dependencies such as Python, PyTorch, CUDA, or the frameworks themselves. |
| Experiment Setup | Yes | In our experiments, we set the value m = 200, so the value of s in Algorithm 1 is also determined. ... To fairly compare the methods with constant storage limits, we uniformly train for 50 epochs in each task on S-CIFAR10 and S-CIFAR100, and 100 epochs in each task on S-Tiny-ImageNet. The batch size is all set to be 32. ... According to the experimental description in the Dynamic ER method, we train the first task for 200 epochs on all datasets and train all subsequent tasks for 170 epochs; the batch size is set to be 128. ... In all the experiments of Section 4, we fix the value of α to be 1e-3. ... The results in Table 4 show that when m is greater than 100, DGC-ER can significantly improve the performance of ER. (These values are collected into a configuration sketch after the table.) |
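
The Pseudocode row cites Algorithm 1 (DGC) and Algorithm 2 (DGC in TFCL), whose listings appear in the paper and are not reproduced in this report. For orientation only, the sketch below shows a generic experience-replay training step with a gradient-calibration hook; `calibrate_gradient`, `memory_buffer`, and their interfaces are hypothetical placeholders marking where a procedure like the paper's Algorithm 1 would act, not a reproduction of it.

```python
import torch

def er_step_with_calibration(model, optimizer, criterion,
                             batch, memory_buffer, calibrate_gradient):
    """One hypothetical experience-replay (ER) step with a calibration hook.

    `calibrate_gradient` stands in for the paper's Algorithm 1 (DGC);
    its actual update rule is defined in the paper, not here.
    """
    x, y = batch
    # Standard ER recipe: mix current-task data with replayed samples.
    mem_x, mem_y = memory_buffer.sample(len(x))   # assumed buffer interface
    inputs = torch.cat([x, mem_x])
    targets = torch.cat([y, mem_y])

    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()

    # Hook point: adjust the raw mini-batch gradient before the optimizer
    # step. In a DGC-style method, a calibrated gradient would replace the
    # plain stochastic gradient here.
    calibrate_gradient(model)

    optimizer.step()
    memory_buffer.add(x, y)  # e.g., reservoir-style update, capacity m
    return loss.item()
```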
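
The Experiment Setup row can be restated as a single configuration sketch. Every value below is taken directly from the quoted text; the dictionary keys themselves are illustrative, not names from the paper or any framework.

```python
# Hyperparameters quoted from the paper; key names are illustrative only.
EXPERIMENT_CONFIG = {
    "buffer_size_m": 200,        # memory size m; also fixes s in Algorithm 1
    "alpha": 1e-3,               # fixed across all experiments in Section 4
    "constant_storage": {        # comparison under constant storage limits
        "S-CIFAR10":       {"epochs_per_task": 50,  "batch_size": 32},
        "S-CIFAR100":      {"epochs_per_task": 50,  "batch_size": 32},
        "S-Tiny-ImageNet": {"epochs_per_task": 100, "batch_size": 32},
    },
    "dynamic_er_protocol": {     # per the Dynamic ER method's description
        "first_task_epochs": 200,
        "subsequent_task_epochs": 170,
        "batch_size": 128,
    },
}
```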