reproducibilityindex.ai

Recasting Continual Learning as Sequence Modeling

Authors: Soochan Lee, Jaehyeon Son, Gunhee Kim

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments on seven benchmarks, covering both classification and regression, show that sequence models can be an attractive solution for general MCL.
Researcher Affiliation	Academia	Soochan Lee Seoul National University soochan.lee@vision.snu.ac.kr Jaehyeon Son Seoul National University sjh9876@snu.ac.kr Gunhee Kim Seoul National University gunhee@snu.ac.kr
Pseudocode	Yes	Algorithm 1 Inner loop of conventional SGD-based MCL
Open Source Code	Yes	Code is available at https://github.com/soochan-lee/cl-as-seq
Open Datasets	Yes	CIFAR-100 [18]. Omniglot [19]. CASIA Chinese Handwriting Database (CASIA) [22]. MS-Celeb-1M [10].
Dataset Splits	No	The paper states: 'The tasks are then split into two disjoint sets, one for meta-training and the other for meta-testing.' It does not explicitly mention a separate validation set or split for hyperparameter tuning, distinct from the meta-training and meta-test sets.
Hardware Specification	Yes	We compare various aspects of the computational cost using our PyTorch [27] implementation on NVIDIA A40 GPUs which have 48 GB of VRAM.
Software Dependencies	No	The paper mentions 'PyTorch [27] implementation' but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup	Yes	By default, we set K = 20, while additionally testing the K = 100 setting to compare performances with longer episodes. For each task k, the training stream $D_k^{\text{train}}$ and the test set $D_k^{\text{test}}$ contain five examples each (i.e., five shots). For each experiment, we meta-train for 50K steps with a batch size of 16 (i.e., 16 episodes in parallel) and meta-test with 1,024 episodes. All the models share a similar architecture: 4 layers, 8 heads, and 512 hidden dimensions.
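
The numbers in the Experiment Setup row can be collected into a small configuration object, which makes the episode sizes they imply easy to check. The following is a minimal sketch, not the authors' implementation: the names SetupSketch and sample_episode, the dictionary layout of the task pool, and the choice to present tasks one after another in the training stream are all assumptions made for illustration. Only the numeric defaults come from the table.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupSketch:
    """Hypothetical container; only the default values come from the table."""
    tasks_per_episode: int = 20      # K (the row also mentions K = 100)
    train_shots: int = 5             # examples per task in the training stream
    test_shots: int = 5              # examples per task in the test set
    meta_train_steps: int = 50_000
    batch_size: int = 16             # episodes processed in parallel
    meta_test_episodes: int = 1_024
    num_layers: int = 4
    num_heads: int = 8
    hidden_dim: int = 512

    @property
    def train_stream_length(self) -> int:
        # Total training examples seen in one episode.
        return self.tasks_per_episode * self.train_shots

    @property
    def test_set_size(self) -> int:
        # Total test examples evaluated in one episode.
        return self.tasks_per_episode * self.test_shots

    @property
    def meta_train_episodes(self) -> int:
        # Episodes consumed over the whole meta-training run.
        return self.meta_train_steps * self.batch_size


def sample_episode(task_pool, cfg, rng):
    """Draw one episode from task_pool (assumed: dict of task id -> examples).

    Assumption: tasks appear one after another in the training stream.
    """
    task_ids = rng.sample(sorted(task_pool), cfg.tasks_per_episode)
    train_stream, test_set = [], []
    for task_id in task_ids:
        # Train and test examples of a task are drawn without overlap.
        examples = rng.sample(task_pool[task_id], cfg.train_shots + cfg.test_shots)
        train_stream += [(task_id, x) for x in examples[:cfg.train_shots]]
        test_set += [(task_id, x) for x in examples[cfg.train_shots:]]
    return train_stream, test_set


# Usage with a toy pool of 50 tasks holding 20 integer "examples" each.
cfg = SetupSketch()
pool = {t: list(range(t * 100, t * 100 + 20)) for t in range(50)}
train_stream, test_set = sample_episode(pool, cfg, random.Random(0))
print(len(train_stream), len(test_set), cfg.meta_train_episodes)
```

With the default K = 20, each episode has 20 × 5 = 100 training examples and 100 test examples, and 50,000 steps at a batch size of 16 correspond to 800,000 meta-training episodes. Setting tasks_per_episode to 100 gives the longer-episode variant mentioned in the same row.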