Overcoming Catastrophic Forgetting by Neuron-Level Plasticity Control

Authors: Inyoung Paik, Sangjun Oh, Taeyeong Kwak, Injung Kim

AAAI 2020, pp. 5339-5346 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results on several datasets show that neuron-level consolidation is substantially more effective than connection-level consolidation approaches.
Researcher Affiliation | Collaboration | Inyoung Paik, Sangjun Oh, Taeyeong Kwak (Deep Bio Inc., Seoul, Republic of Korea; {iypaik, tykwak}@deepbio.com, me@juneoh.net) and Injung Kim (Handong Global University, Pohang, Republic of Korea; ijkim@handong.edu)
Pseudocode | Yes | Algorithm 1: Neuron-level Plasticity Control (NPC). (A hedged sketch of the per-neuron update idea appears after this table.)
Open Source Code | No | The paper does not provide any concrete statement or link regarding the public availability of its source code.
Open Datasets | Yes | We experimented on an incremental version of the MNIST (LeCun et al. 1998) and CIFAR100 (Krizhevsky and Hinton 2009) datasets, where each dataset containing X classes was divided into K subsets of X/K classes, each of which is classified by the k-th task. We set K to 5 for MNIST and 10 for CIFAR100. For preprocessing, we applied random cropping with a padding size of 4 for both datasets. We also applied random horizontal flip for the incremental CIFAR100 (iCIFAR100) dataset. Additionally, we experimented on sequential tasks with heterogeneous datasets, composed of MNIST (LeCun et al. 1998), Fashion-MNIST (fMNIST) (Xiao, Rasul, and Vollgraf 2017), EMNIST (balanced split) (Cohen et al. 2017), and smallNORB (LeCun 2004). (See the task-split sketch after this table.)
Dataset Splits | No | The paper mentions using 'validation accuracy' for tuning hyperparameters and in figures, but it does not provide specific details on the dataset splits (e.g., percentages or sample counts) used for training, validation, and testing within each task.
Hardware Specification | Yes | All experiments were performed on a server with 2 NVIDIA Tesla P40 GPUs.
Software Dependencies | No | The paper mentions various algorithms and components such as CNN, Instance Normalization, and SGD, but it does not specify any software dependencies with version numbers (e.g., PyTorch version, Python version, or specific library versions).
Experiment Setup | Yes | We used a simple CNN with 3 convolutional layers with (128, 512, 256) channels and 2 fully connected layers with (512, number of classes) nodes. Each convolutional layer consists of convolution, Instance Normalization, ReLU activation, and (2,2) max pooling. Dropout (Srivastava et al. 2014) with rate 0.2 is applied between the two fully connected layers. The cross-entropy loss for each task was computed from only the output nodes belonging to the current task. For consistency, we redefined the unit of one epoch in all experiments as the cycle in which the total number of training data was seen. ... we trained the models for 30 epochs on each task. As a result, we used α_NPC = 0.05, β_NPC = 0.5, δ_NPC = 1e-4, λ_EWC = 900, λ_MAS = 3.0, λ_SI = 0.08, λ_SSL = 2e-6. In a baseline experiment, we used L2 regularization with λ = 1e-4. We heuristically set η_max = 0.1 for all experiments. (A model sketch based on this description follows the table.)
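The "Pseudocode" row refers to Algorithm 1 (NPC). Below is a minimal, hypothetical sketch of the neuron-level idea: each neuron (output channel) carries an importance estimate, and its effective learning rate shrinks as that importance grows, so consolidated neurons stay stable while unimportant ones remain plastic. The importance estimator (a running average of |activation × gradient|) and the mapping from importance to learning rate are illustrative assumptions, not the paper's exact formulas; only the hyperparameter values (α_NPC = 0.05, β_NPC = 0.5, δ_NPC = 1e-4, η_max = 0.1) come from the quoted setup, and they may enter the paper's equations differently.

```python
# Illustrative sketch of neuron-level plasticity control (NPC), not the
# paper's exact algorithm. Assumption: per-neuron importance is a running
# average of |activation * d(loss)/d(activation)|, and each neuron's
# learning rate decays with importance, capped at eta_max.
import torch
import torch.nn as nn

ETA_MAX = 0.1                          # eta_max from the quoted setup
ALPHA, BETA, DELTA = 0.05, 0.5, 1e-4   # alpha_NPC, beta_NPC, delta_NPC


def update_importance(importance, activation, grad_activation, momentum=0.9):
    """Running per-neuron importance from activations and their gradients."""
    contrib = (activation * grad_activation).abs()
    # Average over batch and spatial dimensions -> one score per channel.
    reduce_dims = [d for d in range(contrib.dim()) if d != 1]
    contrib = contrib.mean(dim=reduce_dims)
    return momentum * importance + (1.0 - momentum) * contrib


def neuron_learning_rates(importance):
    """Map importance to a per-neuron learning rate (illustrative formula)."""
    lr = ALPHA / (importance.pow(BETA) + DELTA)
    return lr.clamp(max=ETA_MAX)


def npc_step(conv: nn.Conv2d, importance):
    """SGD step in which every output neuron uses its own learning rate."""
    lr = neuron_learning_rates(importance)            # shape: (out_channels,)
    with torch.no_grad():
        conv.weight -= lr.view(-1, 1, 1, 1) * conv.weight.grad
        if conv.bias is not None:
            conv.bias -= lr * conv.bias.grad
```

In this reading, the per-neuron rates are recomputed from the accumulated importance as new tasks arrive, so neurons that mattered for earlier tasks receive only small updates later on, which is the neuron-level counterpart of connection-level consolidation methods such as EWC.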
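The "Open Datasets" row describes splitting each dataset into K tasks of X/K classes (K = 5 for incremental MNIST, K = 10 for iCIFAR100), with random cropping (padding 4) for both and random horizontal flips for iCIFAR100. The sketch below shows one way to build those task streams; the use of torchvision, the Subset-based split, the ./data path, and the crop sizes are assumptions.

```python
# Sketch of the incremental-task construction quoted above: a dataset with
# X classes is divided into K tasks of X/K consecutive classes.
import torchvision
import torchvision.transforms as T
from torch.utils.data import Subset


def incremental_tasks(dataset, num_classes, K):
    """Split a labeled dataset into K tasks of num_classes // K classes each."""
    per_task = num_classes // K
    labels = [int(y) for y in dataset.targets]
    tasks = []
    for k in range(K):
        classes = set(range(k * per_task, (k + 1) * per_task))
        indices = [i for i, y in enumerate(labels) if y in classes]
        tasks.append(Subset(dataset, indices))
    return tasks


# iCIFAR100: random crop with padding 4, random horizontal flip, 10 tasks of 10 classes.
cifar_tf = T.Compose([T.RandomCrop(32, padding=4), T.RandomHorizontalFlip(), T.ToTensor()])
cifar = torchvision.datasets.CIFAR100("./data", train=True, download=True, transform=cifar_tf)
cifar_tasks = incremental_tasks(cifar, num_classes=100, K=10)

# Incremental MNIST: random crop with padding 4, 5 tasks of 2 classes.
mnist_tf = T.Compose([T.RandomCrop(28, padding=4), T.ToTensor()])
mnist = torchvision.datasets.MNIST("./data", train=True, download=True, transform=mnist_tf)
mnist_tasks = incremental_tasks(mnist, num_classes=10, K=5)
```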
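The "Experiment Setup" row specifies the backbone closely enough to write down a matching model. In the sketch below, the kernel size, padding, ReLU placement in the classifier head, and the flattened feature size (computed for 32x32 inputs) are assumptions not stated in the quoted text, and the task_cross_entropy helper is a hypothetical illustration of computing the loss from only the current task's output nodes.

```python
# Backbone matching the quoted setup: three conv blocks with (128, 512, 256)
# channels, each block = Conv -> InstanceNorm -> ReLU -> 2x2 max pool, then
# FC(512) -> Dropout(0.2) -> FC(num_classes). Kernel size 3, padding 1, and
# the 32x32 input resolution (hence 256 * 4 * 4 features) are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleCNN(nn.Module):
    def __init__(self, in_channels=3, num_classes=100, feat_dim=256 * 4 * 4):
        super().__init__()

        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.InstanceNorm2d(c_out, affine=True),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2, 2),
            )

        self.features = nn.Sequential(
            block(in_channels, 128), block(128, 512), block(512, 256)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(feat_dim, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.2),
            nn.Linear(512, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


def task_cross_entropy(logits, labels, task_classes):
    """Cross-entropy restricted to the output nodes of the current task."""
    sub_logits = logits[:, task_classes]
    remap = {c: i for i, c in enumerate(task_classes)}
    sub_labels = torch.tensor([remap[int(y)] for y in labels], device=logits.device)
    return F.cross_entropy(sub_logits, sub_labels)
```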