A Theoretical Study on Solving Continual Learning
Authors: Gyuhak Kim, Changnan Xiao, Tatsuya Konishi, Zixuan Ke, Bing Liu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Based on the theoretical result, new CIL methods are also designed, which outperform strong baselines in both CIL and TIL settings by a large margin. |
| Researcher Affiliation | Collaboration | Gyuhak Kim (University of Illinois at Chicago), Changnan Xiao (ByteDance), Tatsuya Konishi (KDDI Research), Zixuan Ke (University of Illinois at Chicago), Bing Liu (University of Illinois at Chicago) |
| Pseudocode | No | The paper includes theoretical formulations and proofs but does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/k-gyuhak/WPTP |
| Open Datasets | Yes | Four popular benchmark image classification datasets are used, from which six CIL problems are created following recent papers [25, 34, 26]. (1) MNIST... (2) CIFAR-10... (3) CIFAR-100... (4) Tiny-ImageNet... |
| Dataset Splits | Yes | For the replay methods, we use a memory buffer of 200 for MNIST and CIFAR-10 and 2000 for CIFAR-100 and Tiny-ImageNet, as in [29, 34]. We use the hyper-parameters suggested by the authors. If we could not reproduce any result, we use 10% of the training data as a validation set to grid-search for good hyper-parameters. (A hedged sketch of this validation split appears below the table.) |
| Hardware Specification | Yes | All experiments are conducted on 8 NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper describes the model architectures used (AlexNet-like, ResNet-18) and mentions using PyTorch in the Appendix (e.g. 'We use the PyTorch library...'), but it does not provide specific version numbers for any software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | For the backbone structure, we follow [4, 26, 34]. An AlexNet-like architecture [73] is used for MNIST and ResNet-18 [74] is used for CIFAR-10. For CIFAR-100 and Tiny-ImageNet, ResNet-18 is also used as for CIFAR-10, but the number of channels is doubled to fit more classes. ... For the replay methods, we use a memory buffer of 200 for MNIST and CIFAR-10 and 2000 for CIFAR-100 and Tiny-ImageNet, as in [29, 34]. We use the hyper-parameters suggested by the authors. If we could not reproduce any result, we use 10% of the training data as a validation set to grid-search for good hyper-parameters. For our proposed methods, we report the hyper-parameters in Appendix G. All results are averages over 5 runs with random seeds. (A hedged replay-buffer sketch also appears below the table.) |
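
The 10% hold-out validation split used for hyper-parameter grid search (quoted in the Dataset Splits and Experiment Setup rows) can be reproduced along the following lines. This is a minimal sketch, assuming CIFAR-10 from torchvision and a hypothetical two-parameter grid; the paper does not publish its grid values or the per-baseline training loops, so those parts are placeholders.

```python
# Minimal sketch: carve 10% of the training data off as a validation set and
# enumerate a hyper-parameter grid. The grid values and the training /
# evaluation steps are assumptions, not the authors' released code.
import itertools
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())

# Hold out 10% of the training examples for validation.
n_val = int(0.1 * len(full_train))
train_set, val_set = random_split(
    full_train, [len(full_train) - n_val, n_val],
    generator=torch.Generator().manual_seed(0))

# Hypothetical grid; substitute whatever hyper-parameters each baseline exposes.
grid = {"lr": [0.1, 0.01, 0.001], "weight_decay": [0.0, 5e-4]}

for lr, wd in itertools.product(grid["lr"], grid["weight_decay"]):
    # Train the baseline on train_set with (lr, wd) and score it on val_set;
    # the model-specific loops are omitted from this sketch.
    ...
```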
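
The replay baselines are run with a fixed memory budget (200 stored samples for MNIST and CIFAR-10, 2000 for CIFAR-100 and Tiny-ImageNet). One common way to maintain such a budget is reservoir sampling; the sketch below uses that policy as an assumption, since the exact buffer-management rule depends on the individual replay method being reproduced.

```python
# Sketch of a fixed-size replay buffer maintained with reservoir sampling.
# The buffer sizes match the budgets quoted above; the sampling policy is an
# assumption, not necessarily what each replay baseline implements.
import random

BUFFER_SIZES = {"mnist": 200, "cifar10": 200, "cifar100": 2000, "tinyimagenet": 2000}

class ReservoirBuffer:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = []        # stored (example, label) pairs
        self.n_seen = 0       # total number of streamed examples so far

    def add(self, example):
        """Insert one (example, label) pair using reservoir sampling."""
        self.n_seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            j = random.randrange(self.n_seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k: int):
        """Draw a replay mini-batch of up to k stored examples."""
        return random.sample(self.data, min(k, len(self.data)))

# Usage: stream (example, label) pairs through the buffer, then draw replay batches.
buffer = ReservoirBuffer(BUFFER_SIZES["cifar10"])
for i in range(1000):
    buffer.add((f"image_{i}", i % 10))
replay_batch = buffer.sample(32)
```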