Lifelong Learning with Dynamically Expandable Networks

Authors: Jaehong Yoon, Eunho Yang, Jeongtae Lee, Sung Ju Hwang

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate DEN on multiple public datasets under lifelong learning scenarios, on which it not only significantly outperforms existing lifelong learning methods for deep networks, but also achieves the same level of performance as the batch counterparts with substantially fewer number of parameters.
Researcher Affiliation | Collaboration | KAIST, Daejeon, South Korea; UNIST, Ulsan, South Korea; AITrics, Seoul, South Korea
Pseudocode | Yes | Algorithm 1: Incremental Learning of a Dynamically Expandable Network; Algorithm 2: Selective Retraining; Algorithm 3: Dynamic Network Expansion; Algorithm 4: Network Split/Duplication (see the control-flow sketch below the table)
Open Source Code | No | We will release our codes upon acceptance of our paper, for reproduction of the results.
Open Datasets | Yes | Datasets: 1) MNIST-Variation. This dataset consists of 62,000 images... 2) CIFAR-100. This dataset consists of 60,000 images of 100 generic object classes (Krizhevsky & Hinton, 2009). 3) AWA (Animals with Attributes). This dataset consists of 30,475 images of 50 animals (Lampert et al., 2009).
Dataset Splits | Yes | MNIST-Variation: We use 1,000/200/5,000 images for train/val/test split for each class. ... AWA (Animals with Attributes): We use random splits of 30/30/30 images for training/validation/test. (see the per-class split sketch below the table)
Hardware Specification | No | The paper mentions 'GPU computation' in Figure 4(a) but does not specify particular hardware such as CPU model, GPU model, or memory.
Software Dependencies | No | All models and algorithms are implemented using the Tensorflow (Abadi et al., 2016) library. No version number for TensorFlow is provided.
Experiment Setup | Yes | 1) Feedforward networks: We use a two-layer network with 312-128 neurons with ReLU activations. 2) Convolutional networks: For experiments on the CIFAR-100 dataset, we use a modified version of AlexNet (Krizhevsky et al., 2012) that has five convolutional layers (64-128-256-256-128 depth with 5x5 filter size) and three fully-connected layers (384-192-100 neurons at each layer). ... Input: Dataset Dt, thresholds τ, σ ... where µ is the regularization parameter of the element-wise ℓ1 norm for sparsity on W. (see the base-model sketch below the table)
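
The pseudocode row lists Algorithms 1-4. As a reading aid, here is a minimal Python sketch of the control flow of Algorithm 1 (Incremental Learning of a Dynamically Expandable Network), assuming hypothetical helpers selective_retrain, dynamic_expand, and split_duplicate that stand in for Algorithms 2-4; this is not the authors' released code.

```python
# Sketch of the outer loop of Algorithm 1; the helpers below are hypothetical
# stand-ins for Algorithms 2-4 and would carry the actual optimization logic.

def train_den(tasks, tau, sigma, network):
    """Incrementally train a dynamically expandable network on a task sequence."""
    for t, dataset in enumerate(tasks, start=1):
        if t == 1:
            # First task: train the initial network with L1 sparsity regularization.
            network.train_l1(dataset)
        else:
            # Retrain only the subnetwork relevant to the new task (Algorithm 2).
            loss = selective_retrain(network, dataset)
            if loss > tau:
                # Loss still above threshold tau: add capacity and prune
                # useless new units (Algorithm 3).
                dynamic_expand(network, dataset)
            # Duplicate units whose weights drifted more than sigma (Algorithm 4).
            split_duplicate(network, dataset, sigma)
    return network
```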
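
The dataset-splits row reports per-class train/val/test counts (1,000/200/5,000 for MNIST-Variation, 30/30/30 for AWA). Below is a hedged sketch of how such per-class random splits could be produced, assuming NumPy arrays of images and labels; the paper does not describe the exact shuffling or file handling.

```python
import numpy as np

def split_per_class(labels, n_train, n_val, n_test, seed=0):
    """Return per-class index splits of the given sizes (illustrative only)."""
    rng = np.random.default_rng(seed)
    splits = {"train": [], "val": [], "test": []}
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        splits["train"].append(idx[:n_train])
        splits["val"].append(idx[n_train:n_train + n_val])
        splits["test"].append(idx[n_train + n_val:n_train + n_val + n_test])
    return {k: np.concatenate(v) for k, v in splits.items()}

# e.g. MNIST-Variation: split_per_class(y, 1000, 200, 5000)
# e.g. AWA:             split_per_class(y, 30, 30, 30)
```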
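
The experiment-setup row describes a two-layer 312-128 ReLU feedforward base network with an element-wise ℓ1 penalty weighted by µ. Below is a minimal sketch of such a model in tf.keras, chosen only because the paper states TensorFlow was used; the output size, the default value of mu, and the name base_network are assumptions, and this is just the static base model, not the DEN procedure that grows and splits it across tasks.

```python
import tensorflow as tf

def base_network(input_dim, num_classes, mu=1e-4):
    """Static 312-128 ReLU feedforward base model with L1 weight sparsity (sketch)."""
    l1 = tf.keras.regularizers.l1(mu)  # element-wise L1 penalty on the weights
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(312, activation="relu", kernel_regularizer=l1),
        tf.keras.layers.Dense(128, activation="relu", kernel_regularizer=l1),
        tf.keras.layers.Dense(num_classes, kernel_regularizer=l1),
    ])
```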