Residual Continual Learning
Authors: Janghyeon Lee, Donggyu Joo, Hyeong Gwon Hong, Junmo Kim (pp. 4553-4560)
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method for sequential learning of image classification tasks and compare it with other methods, including fine-tuning, LwF, and Mean-IMM, that do not refer to any source task information for fair comparisons. Mode-IMM is not compared in the experiment because it requires the Fisher information matrix, which cannot be obtained without source data. The source and target tasks are to classify the CIFAR-10, CIFAR-100 (Krizhevsky 2009), or SVHN (Netzer et al. 2011) dataset. A pre-activation residual network of 32 layers without bottlenecks (He et al. 2016b) is used. |
| Researcher Affiliation | Academia | Janghyeon Lee,1 Donggyu Joo,1 Hyeong Gwon Hong,2 Junmo Kim1 1School of Electrical Engineering, KAIST 2Graduate School of AI, KAIST {wkdgus9305, jdg105, honggudrnjs, junmo.kim}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1: Residual Continual Learning |
| Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code or a link to a code repository. |
| Open Datasets | Yes | The source and target tasks are to classify the CIFAR-10, CIFAR-100 (Krizhevsky 2009), or SVHN (Netzer et al. 2011) dataset. |
| Dataset Splits | No | The paper mentions 'target validation data' but does not provide specific details on how the dataset was split into training, validation, and test sets (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running its experiments. |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., specific deep learning frameworks or programming language versions). |
| Experiment Setup | Yes | For the CIFAR datasets, data augmentation and hyperparameter settings are the same as those in (He et al. 2016b). Training images are horizontally flipped with a probability of 0.5 and randomly cropped to 32×32 from 40×40 zero-padded images during training. SGD with a momentum of 0.9, a minibatch size of 128, and a weight decay of λdec = 0.0001 optimizes networks until 64000 iterations. [...] The learning rate starts from 0.1 and is multiplied by 0.1 at 32000 and 48000 iterations. The He initialization method (He et al. 2015) is used to initialize source networks. Combination parameters (αs, αt) in ResCL are initialized to (1/2·1, 1/2·1) in order to balance the original and new features at the early stage of training. |
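
The "Experiment Setup" row above can be restated as a training configuration. Below is a minimal sketch of that configuration, assuming a PyTorch implementation (the paper does not state its framework); the placeholder model stands in for the 32-layer pre-activation residual network without bottlenecks (He et al. 2016b) and is not the authors' architecture.

```python
# Sketch of the quoted CIFAR training setup, assuming PyTorch.
import torch
from torch import nn, optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Augmentation described in the paper: horizontal flip with p=0.5 and a
# random 32x32 crop from 40x40 zero-padded images (i.e. 4-pixel padding).
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
])
train_set = datasets.CIFAR10("data", train=True, download=True,
                             transform=train_transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)

# Placeholder network; the paper uses a 32-layer pre-activation ResNet
# without bottlenecks (He et al. 2016b), omitted here for brevity.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# SGD with momentum 0.9 and weight decay 1e-4; the learning rate starts at
# 0.1 and is multiplied by 0.1 at 32,000 and 48,000 of 64,000 iterations.
optimizer = optim.SGD(model.parameters(), lr=0.1,
                      momentum=0.9, weight_decay=1e-4)
scheduler = optim.lr_scheduler.MultiStepLR(optimizer,
                                           milestones=[32000, 48000],
                                           gamma=0.1)
criterion = nn.CrossEntropyLoss()

iteration = 0
while iteration < 64000:
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        scheduler.step()  # schedule is stepped per iteration, not per epoch
        iteration += 1
        if iteration >= 64000:
            break
```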
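
The same row mentions combination parameters (αs, αt) initialized to (1/2·1, 1/2·1). Below is a minimal sketch of one way such parameters could be realized, assuming they are per-channel learnable vectors that linearly mix the original (source) and new feature maps; the exact parameterization and placement follow Algorithm 1 in the paper and are not specified here.

```python
# Hedged sketch of combination parameters balancing two feature streams.
import torch
from torch import nn

class FeatureCombination(nn.Module):
    """Linearly combines original (source) and new features per channel."""

    def __init__(self, num_channels: int):
        super().__init__()
        # Both vectors start at 1/2 * 1 so the two feature streams are
        # balanced at the early stage of training, as quoted above.
        self.alpha_s = nn.Parameter(torch.full((num_channels,), 0.5))
        self.alpha_t = nn.Parameter(torch.full((num_channels,), 0.5))

    def forward(self, feat_source: torch.Tensor,
                feat_target: torch.Tensor) -> torch.Tensor:
        # Broadcast the per-channel weights over (N, C, H, W) feature maps.
        return (self.alpha_s.view(1, -1, 1, 1) * feat_source
                + self.alpha_t.view(1, -1, 1, 1) * feat_target)

# Usage example with dummy 16-channel feature maps.
combine = FeatureCombination(num_channels=16)
fs, ft = torch.randn(2, 16, 32, 32), torch.randn(2, 16, 32, 32)
out = combine(fs, ft)  # same shape as the inputs: (2, 16, 32, 32)
```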