Continual Learning with Adaptive Weights (CLAW)

Authors: Tameem Adel, Han Zhao, Richard E. Turner

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that CLAW achieves state-of-the-art performance on six benchmarks in terms of overall continual learning performance, as measured by classification accuracy, and in terms of addressing catastrophic forgetting.
Researcher Affiliation | Collaboration | Tameem Adel, Department of Engineering, University of Cambridge (tah47@cam.ac.uk); Han Zhao, Carnegie Mellon University (han.zhao@cs.cmu.edu); Richard E. Turner, Department of Engineering, University of Cambridge and Microsoft Research (ret26@cam.ac.uk)
Pseudocode | Yes | Algorithm 1: Continual Learning with Adaptive Weights (CLAW)
Open Source Code | No | The paper does not include any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | The datasets used are MNIST (LeCun et al., 1998), notMNIST (Bulatov, 2011), Fashion-MNIST (Xiao et al., 2017), Omniglot (Lake et al., 2011) and CIFAR-100 (Krizhevsky & Hinton, 2009).
Dataset Splits | Yes | Data is randomly split into three partitions: training, validation and test, with 60% of the data reserved for training, 20% for validation and 20% for testing.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types; the mention of an 'Nvidia GPU grant' in the acknowledgments is too general.
Software Dependencies | No | The paper mentions Adam as the optimiser but does not specify any software versions for programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | The minibatch size is 128 for Split MNIST and 256 for all other experiments. Adam (Kingma & Ba, 2015) is the optimiser used in all six experiments, with η = 0.001, β1 = 0.9 and β2 = 0.999. The number of epochs per task needed to reach saturation for CLAW (and most of the compared methods) was 10 for all experiments except Omniglot and CIFAR-100 (15 epochs). The values used for ω1 and ω2 are 0.05 and 0.02, respectively. For Omniglot, the network is similar to the one used in Schwarz et al. (2018): 4 blocks of 3×3 convolutions with 64 filters, each followed by a ReLU and 2×2 max-pooling. The same CNN is used for CIFAR-100. (A minimal configuration sketch follows below.)
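
As a rough illustration of the reported setup, the sketch below assembles the described convolutional backbone and Adam configuration. PyTorch is an assumption (the paper does not name a framework), as are the input resolution, padding, and the 100-way classifier head; only the 4-block architecture (3×3 convolutions, 64 filters, ReLU, 2×2 max-pooling) and the Adam hyperparameters (η = 0.001, β1 = 0.9, β2 = 0.999) come from the reported setup.

```python
import torch
import torch.nn as nn

# Convolutional backbone described for Omniglot and CIFAR-100: four blocks of
# 3x3 convolutions with 64 filters, each followed by ReLU and 2x2 max-pooling.
# padding=1 (size-preserving convolutions) is an assumption, not stated in the paper.
def make_backbone(in_channels: int = 3) -> nn.Sequential:
    blocks = []
    for i in range(4):
        blocks += [
            nn.Conv2d(in_channels if i == 0 else 64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        ]
    return nn.Sequential(*blocks)

# Assumed CIFAR-100 setting: 3-channel 32x32 inputs and a 100-way linear head.
# After four 2x2 poolings a 32x32 image becomes 2x2, i.e. 64 * 2 * 2 features.
model = nn.Sequential(make_backbone(3), nn.Flatten(), nn.Linear(64 * 2 * 2, 100))

# Optimiser hyperparameters as reported: Adam with eta = 0.001, beta1 = 0.9,
# beta2 = 0.999; minibatch size 256 (128 only for Split MNIST).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

# Sanity check on a dummy minibatch of the reported size.
logits = model(torch.randn(256, 3, 32, 32))
assert logits.shape == (256, 100)
```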