Layerwise Proximal Replay: A Proximal Point Method for Online Continual Learning
Authors: Jinsoo Yoo, Yunpeng Liu, Frank Wood, Geoff Pleiss
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically investigate LPR's effect on how a neural network learns during online continual learning. In addition, we extensively evaluate LPR across three online continual learning problems on top of four state-of-the-art experience replay methods. |
| Researcher Affiliation | Collaboration | ¹University of British Columbia, ²Inverted AI, ³Mila, ⁴Vector Institute. |
| Pseudocode | Yes | Algorithm 1 Layerwise Proximal Replay (LPR) (see the illustrative sketch after the table) |
| Open Source Code | Yes | The code is available at https://github.com/plai-group/LPR. |
| Open Datasets | Yes | For online class-incremental learning, we evaluate on the online versions of Split-CIFAR100 and Split TinyImageNet datasets (Soutif-Cormerais et al., 2023). ... For online domain-incremental learning, we evaluate on the online version of the CLEAR dataset (Lin et al., 2021). |
| Dataset Splits | No | The paper mentions using a "validation set" for metrics like AAA and WC-Acc, but does not specify the exact split percentages, sample counts, or the methodology for creating these splits (e.g., "80/10/10 split" or a specific citation for predefined splits). |
| Hardware Specification | No | The paper mentions using "computational resources provided by the Digital Research Alliance of Canada Compute Canada (alliancecan.ca), the Advanced Research Computing at the University of British Columbia (arc.ubc.ca), and Amazon." This does not provide specific hardware details like GPU/CPU models or memory. |
| Software Dependencies | No | The paper mentions building on "Avalanche continual learning framework (Carta et al., 2023a)" but does not specify its version number or any other software dependencies with their respective versions. |
| Experiment Setup | Yes | For each data batch, we take 3, 9, and 10 gradient steps respectively for Split-CIFAR100, Split TinyImageNet, and Online CLEAR. ... For all baseline methods on all datasets, we searched across their learning rates between {0.01, 0.05, 0.1}. ... For all LPR runs on all datasets, we searched across ω₀ between {0.04, 0.25, 1., 4., 100.} and β between {1., 2.}. The preconditioner update interval was set to T = 10 for all experiments. (See the configuration sketch after the table.) |
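
The Pseudocode row quotes the caption of Algorithm 1 (Layerwise Proximal Replay). As a rough illustration of the general idea of replacing plain replay gradient steps with layerwise preconditioned steps, here is a minimal PyTorch sketch. The preconditioner form P = (I + ω·AᵀA/n)⁻¹ built from replay-buffer activations, the restriction to linear layers, and all names (`compute_preconditioners`, `lpr_step`) are illustrative assumptions, not a transcription of the paper's Algorithm 1.

```python
# Hypothetical sketch of a layerwise preconditioned replay update.
# The preconditioner form and every name below are illustrative assumptions,
# not a faithful transcription of Algorithm 1 in the paper.
import torch
import torch.nn as nn


def compute_preconditioners(model, replay_inputs, omega=1.0):
    """Build one preconditioner per nn.Linear layer from replay activations.

    For a layer whose replay-buffer input activations form A (n x d_in),
    we use P = (I + omega * A^T A / n)^{-1}: gradient directions spanned by
    replay activations are shrunk, so the layer's outputs on replay data
    move less per step (a proximal-style constraint).
    """
    activations, hooks = {}, []

    def make_hook(name):
        def hook(module, inputs, output):
            activations[name] = inputs[0].detach().flatten(1)  # (n, d_in)
        return hook

    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            hooks.append(module.register_forward_hook(make_hook(name)))
    with torch.no_grad():
        model(replay_inputs)  # one forward pass to record layer inputs
    for h in hooks:
        h.remove()

    preconditioners = {}
    for name, act in activations.items():
        n, d = act.shape
        gram = act.t() @ act / n  # (d_in, d_in)
        preconditioners[name] = torch.linalg.inv(
            torch.eye(d, device=act.device) + omega * gram
        )
    return preconditioners


def lpr_step(model, loss, preconditioners, lr=0.1):
    """One SGD step with each linear layer's weight gradient
    right-multiplied by that layer's preconditioner."""
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for name, module in model.named_modules():
            if (isinstance(module, nn.Linear) and name in preconditioners
                    and module.weight.grad is not None):
                P = preconditioners[name]
                module.weight.sub_(lr * module.weight.grad @ P)  # (d_out, d_in) @ (d_in, d_in)
                if module.bias is not None and module.bias.grad is not None:
                    module.bias.sub_(lr * module.bias.grad)
```

In a replay loop, one might recompute the preconditioners only every T batches (the Experiment Setup row reports T = 10) and otherwise take `lpr_step` updates on combined current-plus-replay batches.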
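
The Experiment Setup row lists the per-dataset gradient-step counts and the hyperparameter grids that were searched. The sketch below collects those reported values into a single configuration; the layout and names such as `GRAD_STEPS_PER_BATCH` are our own, and only the numeric values come from the quoted text.

```python
# Reported experiment settings gathered in one place.
# The structure and variable names are illustrative; the values are those
# quoted in the Experiment Setup row above.
from itertools import product

# Gradient steps taken per incoming data batch, per benchmark.
GRAD_STEPS_PER_BATCH = {
    "Split-CIFAR100": 3,
    "Split-TinyImageNet": 9,
    "Online-CLEAR": 10,
}

# Learning-rate grid searched for all baseline methods on all datasets.
BASELINE_LEARNING_RATES = [0.01, 0.05, 0.1]

# LPR-specific grids searched on all datasets.
LPR_OMEGA_0 = [0.04, 0.25, 1.0, 4.0, 100.0]
LPR_BETA = [1.0, 2.0]

# Preconditioner update interval (T) used for all experiments.
PRECONDITIONER_UPDATE_INTERVAL = 10

# The LPR-specific grid alone contains 5 x 2 = 10 combinations per dataset.
lpr_grid = [{"omega_0": w0, "beta": b} for w0, b in product(LPR_OMEGA_0, LPR_BETA)]
print(f"{len(lpr_grid)} LPR (omega_0, beta) combinations per dataset")
```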