Continual Learning through Retrieval and Imagination

Authors: Zhen Wang, Liu Liu, Yiqun Duan, Dacheng Tao

AAAI 2022, pp. 8594-8602

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that DRI performs significantly better than the existing state-of-the-art continual learning methods and effectively alleviates catastrophic forgetting. (5 Experiments, 5.1 Experimental Setup) We consider a strict evaluation setting (Hsu et al. 2018), which models the sequence of tasks under three scenarios: Task Incremental Learning (Task-IL) splits the training samples into partitions of tasks and requires task identities to select the corresponding classifier at inference time; Class Incremental Learning (Class-IL) sequentially increases the number of classes to be classified without requiring task identities, and is the hardest scenario (van de Ven et al. 2018); Domain Incremental Learning (Domain-IL) observes the same classes during each task while the input distribution continuously changes, and task identities remain unknown. Datasets. We experiment with the following datasets: Split MNIST: the MNIST benchmark (LeCun et al. 1998) is split into 5 tasks by grouping together 2 classes; Split CIFAR-10: CIFAR-10 (Krizhevsky et al. 2009) is split into 5 tasks, each of which introduces 2 classes; Split Tiny-ImageNet: Tiny-ImageNet (Stanford 2015) has 100,000 images across 200 classes, and each task consists of a disjoint subset of 20 classes from these 200 classes.
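For illustration only: the Split CIFAR-10 stream quoted above (5 tasks, each introducing 2 classes) could be assembled roughly as in the sketch below. The paper does not name a framework; torchvision is assumed here, and make_split_cifar10 is a hypothetical helper, not the authors' code.

```python
# Hypothetical sketch: build the Split CIFAR-10 task stream (5 tasks x 2 classes),
# assuming a torchvision data pipeline and the default class order 0..9.
from torchvision import datasets, transforms
from torch.utils.data import Subset

def make_split_cifar10(root="./data", classes_per_task=2):
    train = datasets.CIFAR10(root, train=True, download=True,
                             transform=transforms.ToTensor())
    tasks = []
    for t in range(10 // classes_per_task):
        task_classes = set(range(t * classes_per_task, (t + 1) * classes_per_task))
        idx = [i for i, y in enumerate(train.targets) if y in task_classes]
        tasks.append(Subset(train, idx))  # task t contains only its own classes
    return tasks

tasks = make_split_cifar10()  # tasks[0] holds classes {0, 1}, ..., tasks[4] holds {8, 9}
```

Under Task-IL the task index would also be supplied at test time; Class-IL evaluates over all classes seen so far without it.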
Researcher Affiliation | Collaboration | Zhen Wang (1), Liu Liu (1), Yiqun Duan (2), Dacheng Tao (3,1); 1 The University of Sydney, Australia; 2 University of Technology Sydney, Australia; 3 JD Explore Academy, China. Emails: zwan4121@uni.sydney.edu.au, liu.liu1@sydney.edu.au, yiqun.duan@student.uts.edu.au, dacheng.tao@gmail.com
Pseudocode | Yes | Algorithm 1: Deep Retrieval and Imagination (DRI)
Input: continuum dataset D, memory capacity K
Require: parameters θ, IGAN, scalars α and β, learning rate η
M ← {}  (initialize memory with the empty set)
for t = 1, ..., T do
    θ_pre ← θ
    for (x, y) in D_t do
        (x', y') ← sample(M)
        (x'_a, y'_a) ← (IGAN_g(x'), y')
        ℓ'(x', y') ← α ‖f_θ(x'_a) − f_θ_pre(x'_a)‖²₂ + β ℓ(θ; x'_a, y'_a)
        (x_b, y_b) ← rebalance((x, y), (x'_a, y'_a))
        θ ← θ − η ∇_θ [ℓ(θ; x_b, y_b) + ℓ'(x', y')]  (Section 3.2)
    end for
    IGAN ← updateIGAN(IGAN; D_t, M)  (Section 3.3)
    M ← updateMemory(M; D_t, θ, K)  (Eq. (8))
end for
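Read as code, the quoted pseudocode corresponds roughly to the control flow sketched below. This is not the authors' implementation: sample, rebalance, update_igan, update_memory, and igan.generate are placeholders for components defined in the paper (Sections 3.2-3.3, Eq. (8)), and mse_loss / cross_entropy stand in for the squared L2 term and the generic loss ℓ.

```python
# Control-flow sketch of Algorithm 1 (DRI) in PyTorch style; all helper
# callables are placeholders for components specified in the paper.
import copy
import torch
import torch.nn.functional as F

def train_dri(model, igan, tasks, memory, alpha, beta, lr,
              sample, rebalance, update_igan, update_memory, K):
    for task_loader in tasks:                               # t = 1, ..., T
        model_pre = copy.deepcopy(model)                    # theta_pre <- theta (frozen snapshot)
        model_pre.eval()
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for x, y in task_loader:
            x_r, y_r = sample(memory)                       # retrieve exemplars from memory M
            x_a = igan.generate(x_r)                        # "imagination": augment retrieved samples
            with torch.no_grad():
                out_pre = model_pre(x_a)
            # auxiliary term: distill against the snapshot and replay imagined samples
            # (mse_loss stands in for the squared L2 norm in the pseudocode)
            aux = alpha * F.mse_loss(model(x_a), out_pre) \
                + beta * F.cross_entropy(model(x_a), y_r)
            x_b, y_b = rebalance((x, y), (x_a, y_r))        # balance current and replayed data
            loss = F.cross_entropy(model(x_b), y_b) + aux   # theta update, Section 3.2
            opt.zero_grad()
            loss.backward()
            opt.step()
        igan = update_igan(igan, task_loader, memory)           # Section 3.3
        memory = update_memory(memory, task_loader, model, K)   # Eq. (8)
    return model, igan, memory
```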
Open Source Code | No | The paper does not contain an explicit statement about releasing code or a link to a code repository.
Open Datasets | Yes | Datasets. We experiment with the following datasets: Split MNIST: the MNIST benchmark (LeCun et al. 1998) is split into 5 tasks by grouping together 2 classes; Split CIFAR-10: CIFAR-10 (Krizhevsky et al. 2009) is split into 5 tasks, each of which introduces 2 classes; Split Tiny-ImageNet: Tiny-ImageNet (Stanford 2015) has 100,000 images across 200 classes, and each task consists of a disjoint subset of 20 classes from these 200 classes.
Dataset Splits | Yes | We select the hyper-parameters by performing a grid search on the validation set, which is obtained by sampling 10% of the training set.
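A minimal sketch of the quoted protocol, assuming PyTorch's random_split; the paper does not say how the 10% is drawn, so the random hold-out and fixed seed below are assumptions.

```python
# Hold out 10% of a task's training set as a validation split for grid search.
import torch
from torch.utils.data import random_split

def holdout_validation(task_dataset, val_fraction=0.1, seed=0):
    n_val = int(len(task_dataset) * val_fraction)
    n_train = len(task_dataset) - n_val
    generator = torch.Generator().manual_seed(seed)   # seed is an assumption, not from the paper
    return random_split(task_dataset, [n_train, n_val], generator=generator)
```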
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments (e.g., GPU models or CPU specifications).
Software Dependencies | No | The paper mentions using "the stochastic gradient descent (SGD) optimizer" and "ResNet18 (He et al. 2016)" but does not specify any software names with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
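Because only ResNet18 and SGD are named, any concrete setup is an assumption; one plausible PyTorch/torchvision instantiation (hypothetical head size, learning rate, and momentum) is sketched below.

```python
# Assumed setup: torchvision ResNet18 backbone trained with SGD.
# The number of classes, learning rate, and momentum are illustrative only.
import torch.optim as optim
from torchvision.models import resnet18

backbone = resnet18(num_classes=10)                                 # ResNet18 (He et al. 2016)
optimizer = optim.SGD(backbone.parameters(), lr=0.1, momentum=0.9)  # SGD; values not from the paper
```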
Experiment Setup | No | The paper states that hyper-parameters are selected via a grid search on the validation set and that models are trained with SGD, but it does not explicitly list values for the learning rate, batch size, number of epochs, or other detailed training configurations in the main text.
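To make the grid-search step concrete, a generic sketch follows; the grids over α and β, the fixed learning rate, and the train/eval callables are hypothetical, since the paper reports none of these values.

```python
# Generic grid search over the Algorithm 1 loss weights alpha and beta on the
# held-out validation split; all grids and the learning rate are placeholders.
import itertools

def grid_search(train_fn, eval_fn, train_set, val_set,
                alphas=(0.1, 0.5, 1.0), betas=(0.1, 0.5, 1.0), lr=0.03):
    best_cfg, best_score = None, float("-inf")
    for alpha, beta in itertools.product(alphas, betas):
        model = train_fn(train_set, alpha=alpha, beta=beta, lr=lr)  # user-supplied training routine
        score = eval_fn(model, val_set)                             # e.g. validation accuracy
        if score > best_score:
            best_cfg, best_score = (alpha, beta), score
    return best_cfg
```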