Sliced Cramér Synaptic Consolidation for Preserving Deeply Learned Representations

Authors: Soheil Kolouri, Nicholas A. Ketz, Andrea Soltoggio, Praveen K. Pilly

ICLR 2020

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | We propose the sliced Cramér distance as a suitable choice for such preservation and evaluate our Sliced Cramér Preservation (SCP) algorithm through extensive empirical investigations on various network architectures in both supervised and unsupervised learning settings. We show that SCP consistently utilizes the learning capacity of the network better than online-EWC and MAS methods on various incremental learning tasks.
Researcher Affiliation | Collaboration | Soheil Kolouri, Nicholas A. Ketz, & Praveen K. Pilly (HRL Laboratories, LLC, Malibu, CA 91301, USA; {skolouri, naketz, pkpilly}@hrl.com); Andrea Soltoggio (School of Computer Science, Loughborough University, Leicestershire, UK; a.soltoggio@lboro.ac.uk)
Pseudocode | Yes | Algorithm 1: Sliced Cramér Preservation (SCP). (A hedged, illustrative sketch of this kind of consolidation step is given after the table.)
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or a link to a code repository.
Open Datasets | Yes | We first test our proposed algorithm on the benchmark permuted MNIST task... Next, we consider an experiment consisting of unsupervised/self-supervised sequential learning... learn an auto-encoder on single digits of the MNIST dataset sequentially... We specifically consider the problem of learning semantic segmentation of road scenes... use two sequences of the SYNTHIA dataset (Ros et al., 2016)... The CIFAR100 (Krizhevsky et al., 2009) dataset. (A generic permuted-MNIST task construction is sketched after the table.)
Dataset Splits | No | The paper describes training and test sets but does not explicitly specify validation dataset splits with percentages or counts.
Hardware Specification | No | The paper does not explicitly state the specific hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies | No | The paper mentions the use of the 'ADAM optimizer (Kingma & Ba, 2014)' but does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | Yes | For this experiment, we used a fully-connected network (i.e., a multi-layer perceptron) with the following architecture, 784 → 1024 → 512 → 256 → 10 neurons, and for all optimizations we used the ADAM optimizer with learning rate lr = 1e-4. For our proposed method, SCP, we used L = 100 slices. The model is chosen to be a fully connected auto-encoder, with the following encoder 728 → 1024 → 1024 → 1024 → 256, a mirrored decoder 256 → 1024 → 1024 → 1024 → 784, and Rectified Linear Unit (ReLU) activations. For the loss function, we used cross-entropy plus the ℓ1-norm of the reconstruction error... We perform 50 epochs of learning on each digit... each task was learned over 100 epochs. (PyTorch-style sketches of these two architectures are given after the table.)
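
The paper's Algorithm 1 is the authoritative description of SCP; it is not reproduced here. As a rough illustration only, the following PyTorch sketch shows one plausible reading of the idea summarized above: estimate per-parameter importances by projecting the network output onto L random unit vectors (the "slices") and accumulating squared gradients of those projections, then penalize drift of important parameters while learning the next task. The function names (`compute_scp_importance`, `consolidation_penalty`) and the exact way the slices and averages are taken are assumptions, not the authors' implementation.

```python
import torch

def compute_scp_importance(model, data_loader, n_slices=100, device="cpu"):
    """Hypothetical sketch: estimate per-parameter importance from how strongly
    random 1-D projections ("slices") of the network output respond to each
    parameter. Loosely follows the SCP description; not the authors' code."""
    importance = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    n_batches = 0
    for x, _ in data_loader:
        x = x.to(device)
        out = model(x)                                   # (batch, d_out)
        # Random unit vectors on the output space (the slices).
        xi = torch.randn(n_slices, out.shape[1], device=device)
        xi = xi / xi.norm(dim=1, keepdim=True)
        projections = out @ xi.t()                       # (batch, n_slices)
        for l in range(n_slices):
            model.zero_grad()
            projections[:, l].mean().backward(retain_graph=(l < n_slices - 1))
            for n, p in model.named_parameters():
                if p.grad is not None:
                    importance[n] += p.grad.detach() ** 2
        n_batches += 1
    return {n: v / max(n_batches * n_slices, 1) for n, v in importance.items()}

def consolidation_penalty(model, old_params, importance, lam=1.0):
    """Quadratic penalty discouraging drift of important parameters, in the
    same spirit as online-EWC, MAS, and SCP-style regularizers."""
    loss = 0.0
    for n, p in model.named_parameters():
        loss = loss + (importance[n] * (p - old_params[n]) ** 2).sum()
    return lam * loss
```

In a sequential-learning loop, `consolidation_penalty(model, old_params, importance)` would be added to the new task's loss after each consolidation point, with `old_params` holding detached copies of the parameters at that point.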
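
The permuted MNIST benchmark mentioned in the Open Datasets row is usually built by fixing one random pixel permutation per task. Below is a minimal, generic sketch assuming torchvision's standard MNIST loader; the excerpt does not specify the authors' exact construction (number of tasks, or whether the first task is left unpermuted).

```python
import torch
from torchvision import datasets, transforms

def make_permuted_mnist_tasks(n_tasks=10, root="./data"):
    """Generic permuted-MNIST construction: each task applies a fixed random
    permutation to the 784 pixels of every image. Illustrative sketch only."""
    base = transforms.Compose([transforms.ToTensor(),
                               transforms.Lambda(lambda x: x.view(-1))])
    tasks = []
    for t in range(n_tasks):
        # Common convention (assumed): task 0 uses the identity permutation.
        perm = torch.randperm(784) if t > 0 else torch.arange(784)
        tf = transforms.Compose([base, transforms.Lambda(lambda x, p=perm: x[p])])
        train = datasets.MNIST(root, train=True, download=True, transform=tf)
        test = datasets.MNIST(root, train=False, download=True, transform=tf)
        tasks.append((train, test))
    return tasks
```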
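
The Experiment Setup row fixes the layer widths, optimizer, and learning rate. The sketch below instantiates those two architectures in PyTorch. The auto-encoder input width is taken as 784 (28 x 28 MNIST images, matching the decoder output), and the final Sigmoid on the reconstruction is an assumption made so that a cross-entropy reconstruction loss is well defined; neither detail is stated explicitly in the excerpt.

```python
import torch.nn as nn
import torch.optim as optim

# Classifier for permuted MNIST: 784 -> 1024 -> 512 -> 256 -> 10.
classifier = nn.Sequential(
    nn.Linear(784, 1024), nn.ReLU(),
    nn.Linear(1024, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

# Fully connected auto-encoder: encoder 784 -> 1024 -> 1024 -> 1024 -> 256,
# mirrored decoder 256 -> 1024 -> 1024 -> 1024 -> 784, ReLU activations.
autoencoder = nn.Sequential(
    nn.Linear(784, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 256), nn.ReLU(),
    nn.Linear(256, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 784), nn.Sigmoid(),  # Sigmoid output is an assumption.
)

# ADAM with lr = 1e-4, as stated in the setup.
optimizer = optim.Adam(classifier.parameters(), lr=1e-4)
```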