Continual Learning with Recursive Gradient Optimization
Authors: Hao Liu, Huaping Liu
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that RGO significantly outperforms the baselines on popular continual classification benchmarks and achieves new state-of-the-art performance on 20-split-CIFAR100 (82.22%) and 20-split-miniImageNet (72.63%). |
| Researcher Affiliation | Academia | Hao Liu, Department of Computer Science, Tsinghua University, Beijing, China (hao-liu20@mails.tsinghua.edu.cn); Huaping Liu, Department of Computer Science, Tsinghua University, Beijing, China (hpliu@tsinghua.edu.cn) |
| Pseudocode | Yes | Algorithm 1 Learning Algorithm of Recursive Gradient Optimization |
| Open Source Code | Yes | We provide reproducible source code in the supplementary materials and describe the implementation of the baseline methods in Appendix C.1. |
| Open Datasets | Yes | Permuted MNIST (Goodfellow et al., 2014; Kirkpatrick et al., 2017) and Rotated MNIST (Chaudhry et al., 2020) are variants of the MNIST dataset of handwritten digits (LeCun, 1998)... Split-CIFAR100 (Zenke et al., 2017)... Split miniImageNet, introduced by Chaudhry et al. (2020), applies a similar division to a subset of the original ImageNet (Russakovsky et al., 2015) dataset. (See the Permuted MNIST sketch after this table.) |
| Dataset Splits | No | The paper mentions training, testing, and task divisions but does not explicitly detail a separate validation dataset split. |
| Hardware Specification | Yes | All experiments of our method are completed in several hours on four NVIDIA 2080 Ti GPUs. |
| Software Dependencies | Yes | All results can be reproduced with Python 3.6 and TensorFlow 1.4. |
| Experiment Setup | Yes | Architectures and training details: ... MNIST variants are trained for 1000 steps, while CIFAR and miniImageNet are trained for 2000 steps. Batch size is set at 10 for all tasks. ... The learning rates of all baselines are selected by a hyperparameter search over [0.003, 0.01, 0.03, 0.1, 0.3, 1]... Recursive Gradient Optimization (Ours) learning rate: 0.1 (MNIST), 0.03 (CIFAR100, miniImageNet 2000 steps), 0.01 (miniImageNet 20 epochs). All experiments are run 5 times with 20 epochs. Learning rate is set at 0.03 and 0.01 for CIFAR and miniImageNet respectively. (See the configuration sketch after this table.) |
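
The benchmarks quoted in the Open Datasets row follow the standard Permuted MNIST protocol (one fixed random pixel permutation per task). The snippet below is a minimal illustrative sketch of that protocol, not the authors' released code; the helper name `make_permuted_mnist_tasks`, the number of tasks, and the seed handling are assumptions.

```python
import numpy as np

def make_permuted_mnist_tasks(x, y, num_tasks=20, seed=0):
    """Build Permuted MNIST tasks: each task applies one fixed random pixel
    permutation to every image. Illustrative sketch only; the paper's own
    task generation (number of tasks, seeds) may differ."""
    rng = np.random.RandomState(seed)
    x_flat = x.reshape(len(x), -1)                 # flatten 28x28 images to 784 pixels
    tasks = []
    for _ in range(num_tasks):
        perm = rng.permutation(x_flat.shape[1])    # one permutation per task
        tasks.append((x_flat[:, perm], y))         # permuted inputs, same labels
    return tasks
```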
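The hyperparameters quoted in the Experiment Setup row can be collected into a single configuration for reference. The dictionary below is a sketch with hypothetical key names; the numeric values are the ones reported in the paper's setup description.

```python
# Hypothetical container and key names; numeric values come from the
# experiment-setup description quoted in the table above.
RGO_SETUP = {
    "batch_size": 10,                                   # all tasks
    "train_steps": {"mnist_variants": 1000,
                    "cifar100": 2000,
                    "mini_imagenet": 2000},
    "rgo_learning_rate": {"mnist_variants": 0.1,
                          "cifar100": 0.03,
                          "mini_imagenet_2000_steps": 0.03,
                          "mini_imagenet_20_epochs": 0.01},
    "baseline_lr_search_grid": [0.003, 0.01, 0.03, 0.1, 0.3, 1],
    "runs_per_experiment": 5,                           # each trained for 20 epochs
}
```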