Summarizing Stream Data for Memory-Constrained Online Continual Learning
Authors: Jianyang Gu, Kai Wang, Wei Jiang, Yang You
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on multiple online continual learning benchmarks support the claim that the proposed SSD method significantly enhances replay effects. The authors demonstrate that, with limited extra computational overhead, SSD provides more than a 3% accuracy boost on sequential CIFAR-100 under an extremely restricted memory buffer. |
| Researcher Affiliation | Academia | Jianyang Gu^{1,2}, Kai Wang^2, Wei Jiang^{1*}, Yang You^2; ^1 Zhejiang University, ^2 National University of Singapore. Emails: {gu jianyang, jiangwei zju}@zju.edu.cn, {kai.wang, youy}@comp.nus.edu.sg |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code in https://github.com/vimar-gu/SSD. |
| Open Datasets | Yes | We evaluate our methods on three popular continual learning benchmarks. Sequential CIFAR-100 splits the CIFAR-100 dataset into 10 tasks, each with 10 non-overlapping classes (Krizhevsky, Hinton et al. 2009). Sequential Mini-ImageNet splits the Mini-ImageNet dataset into 10 tasks, and each task contains 10 classes (Vinyals et al. 2016). Sequential Tiny-ImageNet splits the Tiny-ImageNet dataset into 20 tasks, each of which consists of 10 independent classes (Deng et al. 2009). |
| Dataset Splits | No | The paper does not provide the dataset split information (exact percentages, sample counts, citations to predefined splits, or splitting methodology) needed to reproduce the partitioning into training, validation, and test sets. It states that 'Sequential CIFAR-100 splits the CIFAR-100 dataset into 10 tasks' but does not specify the train/validation/test splits within each task or for the overall experiment. |
| Hardware Specification | No | The paper mentions 'GPU memory consumption' and discusses 'computational overhead' but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions models like ResNet-18, optimizers like SGD, and refers to baseline methods, but it does not provide specific software dependency details such as library names with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) needed to replicate the experiment. |
| Experiment Setup | Yes | We set the memory size to contain 1, 5, and 10 images per class. An SGD optimizer is adopted for parameter updating, with the learning rate set to 0.1. The training objective L_t is the standard cross-entropy loss. The distance metric D is the Euclidean distance. The summarizing interval τ is set to 6 and the similarity matching coefficient γ to 1. |
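The hyperparameters reported in the Experiment Setup row can be sketched as a minimal configuration. This is an illustrative sketch only: the constant names, the `euclidean` helper, and the step loop below are assumptions for clarity, not taken from the released SSD code.

```python
import math

# Hedged sketch of the reported setup; constant names are illustrative.
TAU = 6      # summarizing interval (paper: tau = 6)
GAMMA = 1.0  # similarity matching coefficient (paper: gamma = 1)
LR = 0.1     # SGD learning rate (paper: 0.1)

def euclidean(a, b):
    """Distance metric D from the paper: plain Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def training_steps(num_steps):
    """Yield (step, do_summarize): summarization fires every TAU steps."""
    for step in range(1, num_steps + 1):
        yield step, step % TAU == 0

# Example: over 18 training steps, summarization triggers at 6, 12, 18.
summarize_steps = [s for s, do in training_steps(18) if do]
```

In an actual run, the cross-entropy objective and SGD update would replace the placeholder loop; the sketch only fixes the schedule and distance metric the paper specifies.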