Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer
Authors: Sen Lin, Li Yang, Deliang Fan, Junshan Zhang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental studies show that CUBER can even achieve positive backward knowledge transfer on several existing CL benchmarks for the first time without data replay, where the related baselines still suffer from catastrophic forgetting (negative backward knowledge transfer). The superior performance of CUBER on the backward knowledge transfer also leads to higher accuracy accordingly. |
| Researcher Affiliation | Academia | Sen Lin, School of ECEE, Arizona State University (slin70@asu.edu); Li Yang, School of ECEE, Arizona State University (lyang166@asu.edu); Deliang Fan, School of ECEE, Arizona State University (dfan@asu.edu); Junshan Zhang, Department of ECE, University of California, Davis (jazh@ucdavis.edu) |
| Pseudocode | Yes | Algorithm 1 Continual learning with backward knowledge transfer (CUBER) |
| Open Source Code | Yes | We include the code in the supplemental material. |
| Open Datasets | Yes | Datasets. We evaluate the performance of CUBER on four standard CL benchmarks. (1) Permuted MNIST: a variant of the MNIST dataset [14] where random permutations are applied to the input pixels. ... (2) Split CIFAR-100: we divide the CIFAR-100 dataset [13] ... (3) 5-Datasets: we consider a sequence of 5 datasets, i.e., CIFAR-10, MNIST, SVHN [21], notMNIST [2], Fashion MNIST [28], and the classification problem on each dataset is a task. (4) Split miniImageNet: we divide the miniImageNet dataset [27]... (A construction sketch for the Permuted MNIST benchmark appears below the table.) |
| Dataset Splits | No | While the paper mentions "early termination based on the validation loss" indicating the use of a validation set, it does not provide specific details on the dataset splits (e.g., percentages or sample counts) for training, validation, and testing. |
| Hardware Specification | No | The main text of the paper, including the 'Network and training details' section, does not specify the hardware used for the experiments (e.g., specific GPU or CPU models). While the checklist states 'See the appendix' for this information, the appendix content is not provided. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., names of libraries or frameworks with their versions) used for the experiments. |
| Experiment Setup | Yes | For Permuted MNIST, we consider a 3-layer fully-connected network including 2 hidden layers with 100 units. We train the network for 5 epochs with a batch size of 10 for every task. For Split CIFAR-100, we use a version of 5-layer AlexNet by following [25, 17]. When learning each task, we train the network for a maximum of 200 epochs with early termination based on the validation loss, and use a batch size of 64. ... In the experiments, we set ϵ1 = 0.5. (A training-configuration sketch appears below the table.) |
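
The Open Datasets row quotes the paper's description of the Permuted MNIST benchmark but not its generation code. Below is a minimal sketch of how such a task sequence is typically built: each task reuses MNIST with a fixed random permutation of the 784 input pixels. The number of tasks, the seed, and the helper names (`make_permuted_mnist_tasks`, `apply_perm`) are assumptions for illustration, not the authors' released code.

```python
import numpy as np
import torch
from torchvision import datasets, transforms

NUM_TASKS = 10  # assumed task count for illustration; not taken from the excerpt


def make_permuted_mnist_tasks(root="./data", num_tasks=NUM_TASKS, seed=0):
    """Build a sequence of Permuted MNIST tasks (assumed construction)."""
    rng = np.random.RandomState(seed)
    base_train = datasets.MNIST(root, train=True, download=True,
                                transform=transforms.ToTensor())
    base_test = datasets.MNIST(root, train=False, download=True,
                               transform=transforms.ToTensor())
    tasks = []
    for _ in range(num_tasks):
        # One fixed random permutation of the 28x28 = 784 pixels per task,
        # shared by that task's train and test splits.
        perm = torch.from_numpy(rng.permutation(28 * 28))
        tasks.append({"perm": perm, "train": base_train, "test": base_test})
    return tasks


def apply_perm(x, perm):
    # x: (batch, 1, 28, 28) -> flattened to (batch, 784) and pixel-permuted.
    return x.view(x.size(0), -1)[:, perm]
```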
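The Experiment Setup row fixes the Permuted MNIST architecture and schedule (a 3-layer fully-connected network with two 100-unit hidden layers, 5 epochs, batch size 10 per task) but not the optimizer or learning rate. The sketch below wires those stated values into a plain per-task training loop; the SGD optimizer and learning rate are assumptions, and CUBER's gradient-projection update itself is deliberately omitted.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader


class MLP(nn.Module):
    """3-layer fully-connected network: two 100-unit hidden layers (from the table)."""

    def __init__(self, in_dim=28 * 28, hidden=100, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        return self.net(x.view(x.size(0), -1))


def train_task(model, train_set, epochs=5, batch_size=10, lr=0.01):
    # Epochs and batch size follow the quoted setup; the optimizer choice and
    # learning rate are assumptions (not stated in the excerpt).
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            # CUBER would modify gradients here (its backward-transfer update);
            # that step is omitted in this plain-SGD sketch.
            opt.step()
    return model
```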