A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning

Authors: Soochan Lee, Junsoo Ha, Dongsu Zhang, Gunhee Kim

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | With extensive experiments, we show that our model successfully performs task-free continual learning for both discriminative and generative tasks such as image classification and image generation. 1 INTRODUCTION ... With several benchmark experiments of CL literature on MNIST, SVHN, and CIFAR 10/100, we show that our model successfully performs multiple types of CL tasks, including image classification and generation. 4 EXPERIMENTS: We evaluate the proposed CN-DPM model in task-free CL with four benchmark datasets.
Research Type Experimental With extensive experiments, we show that our model successfully performs task-free continual learning for both discriminative and generative tasks such as image classification and image generation. 1 INTRODUCTION...With several benchmark experiments of CL literature on MNIST, SVHN, and CIFAR 10/100, we show that our model successfully performs multiple types of CL tasks, including image classification and generation. 4 EXPERIMENTS We evaluate the proposed CN-DPM model in task-free CL with four benchmark datasets.
Researcher Affiliation | Academia | Soochan Lee, Junsoo Ha, Dongsu Zhang & Gunhee Kim, Department of Computer Science, Seoul National University, Seoul, Republic of Korea. {soochan.lee,junsoo.ha}@vision.snu.ac.kr, {96lives,gunhee}@snu.ac.kr, http://vision.snu.ac.kr/projects/cn-dpm
Pseudocode | Yes | Algorithm 1: Training of the Continual Neural Dirichlet Process Mixture (CN-DPM) Model (a loose training-loop sketch is given after this table).
Open Source Code | No | The paper provides a project URL (http://vision.snu.ac.kr/projects/cn-dpm) on the first page, but this is a general project page and not an explicit statement of code release or a direct link to a code repository for the methodology described in the paper.
Open Datasets | Yes | Split-MNIST (Zenke et al., 2017). The MNIST dataset (LeCun et al., 1998)... MNIST-SVHN (Shin et al., 2017). ...SVHN (Netzer et al., 2011)... Split-CIFAR10 and Split-CIFAR100. In Split-CIFAR10, we split CIFAR10 (Krizhevsky & Hinton, 2009)... (an illustrative Split-MNIST loader is sketched after this table).
Dataset Splits | No | The paper describes how datasets are split into tasks for the continual-learning scenarios (e.g., Split-MNIST, Split-CIFAR10) and mentions training sets, but it does not provide specific train/validation/test splits with percentages or counts, nor an explicit methodology for a validation set.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions software components such as the Adam optimizer and ReLU activation, but it does not specify version numbers for any programming languages or libraries (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | C.3 EXPERIMENTAL DETAILS: We use the classifier temperature parameter of 0.01 for Split-MNIST and Split-CIFAR10/100... A weight decay of 0.00001 has been used... Gradients are clipped by value with a threshold of 0.5. All the CN-DPM models are trained with the Adam optimizer. During the sleep phase, we train the new expert for multiple epochs with a batch size of 50. ... Learning rates of 0.0001 and 0.0004 have been used for the classifier and the VAE of each expert in the classification task (the sketch after this table shows one way these settings fit together).
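
The paper's Algorithm 1 is only referenced above, not reproduced here. The block below is a loose, illustrative Python sketch of the expansion mechanism it describes: responsibility-based routing to experts, a short-term memory (STM) for samples claimed by the new-expert slot, and a sleep phase that trains a new expert once the STM fills. The Expert class, its log_joint stand-in, and the constant log_alpha score for the new-expert slot are simplifications and placeholders, not the authors' implementation.

```python
import torch


class Expert:
    """Placeholder expert. In CN-DPM each expert pairs a discriminative model
    (classifier) with a generative model (VAE); a single linear head stands in
    for both here so the sketch stays short."""

    def __init__(self, in_dim, n_classes, lr=1e-4):
        self.net = torch.nn.Linear(in_dim, n_classes)
        self.opt = torch.optim.Adam(self.net.parameters(), lr=lr)
        self.count = 0  # number of samples assigned to this expert so far

    def log_joint(self, x, y):
        # Stand-in for log p(x, y | expert); the paper combines the expert's
        # VAE density p(x) with the classifier p(y | x).
        return torch.log_softmax(self.net(x), dim=-1)[0, y].detach()

    def update(self, x, y):
        loss = torch.nn.functional.cross_entropy(self.net(x), y.view(1))
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()


def train_cn_dpm(stream, in_dim, n_classes, log_alpha=-400.0, stm_capacity=500):
    """Loose sketch of the expand-or-assign loop: route each sample to the
    most responsible expert, buffer samples claimed by the new-expert slot
    in the STM, and run a sleep phase once the STM reaches capacity."""
    experts, stm = [], []
    for x, y in stream:  # x: (1, in_dim) float tensor, y: scalar long tensor
        # Responsibility scores: log N_k + log p(x, y | k) for existing experts,
        # plus a constant log(alpha)-style score for the new-expert slot
        # (the paper weights alpha by the likelihood under a prior model).
        scores = [e.log_joint(x, y) + torch.log(torch.tensor(float(e.count)))
                  for e in experts]
        scores.append(torch.tensor(log_alpha))
        k = int(torch.argmax(torch.stack(scores)))
        if k < len(experts):
            experts[k].update(x, y)  # wake phase: fine-tune the chosen expert
            experts[k].count += 1
        else:
            stm.append((x, y))       # claimed by the new-expert slot
            if len(stm) >= stm_capacity:
                new = Expert(in_dim, n_classes)   # sleep phase
                for _ in range(5):                # a few epochs over the STM
                    for xs, ys in stm:
                        new.update(xs, ys)
                new.count = len(stm)
                experts.append(new)
                stm.clear()
    return experts
```

Prediction in CN-DPM combines the experts' outputs weighted by their responsibilities, which this sketch omits.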
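
The benchmarks quoted in the Open Datasets row are standard label-split streams. As a point of reference only, a common way to build the Split-MNIST stream (five two-class tasks) with torchvision looks like the sketch below; it is an illustrative loader under that assumption, not the authors' data pipeline.

```python
from torch.utils.data import ConcatDataset, Subset
from torchvision import datasets, transforms


def split_mnist_tasks(root="./data", train=True):
    """Build the five Split-MNIST tasks (digits 0/1, 2/3, ..., 8/9) as Subsets
    of MNIST, following the usual Zenke et al. (2017) protocol."""
    mnist = datasets.MNIST(root, train=train, download=True,
                           transform=transforms.ToTensor())
    tasks = []
    for a, b in [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]:
        mask = (mnist.targets == a) | (mnist.targets == b)
        tasks.append(Subset(mnist, mask.nonzero(as_tuple=True)[0].tolist()))
    return tasks


# Task-free CL then presents the tasks as one sequential stream without task ids.
stream = ConcatDataset(split_mnist_tasks())
```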
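
To make the quoted hyperparameters concrete, the following sketch wires them into per-expert optimizers with standard PyTorch calls: Adam with learning rates 0.0001/0.0004, weight decay 0.00001, gradient clipping by value at 0.5, and a temperature-scaled softmax for the 0.01 classifier temperature. The module names are placeholders, and exactly where the paper applies the temperature is an assumption, not confirmed.

```python
import torch


def build_expert_optimizers(classifier, vae):
    """Wire up the quoted settings: Adam with learning rate 0.0001 for the
    classifier and 0.0004 for the VAE, each with weight decay 0.00001.
    `classifier` and `vae` are placeholder nn.Modules."""
    opt_clf = torch.optim.Adam(classifier.parameters(), lr=1e-4, weight_decay=1e-5)
    opt_vae = torch.optim.Adam(vae.parameters(), lr=4e-4, weight_decay=1e-5)
    return opt_clf, opt_vae


def clipped_step(module, optimizer, loss, clip_value=0.5):
    """One update with gradient clipping by value at 0.5, as quoted above."""
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_value_(module.parameters(), clip_value)
    optimizer.step()


def tempered_probs(logits, temperature=0.01):
    """One standard way to apply the quoted classifier temperature of 0.01:
    divide logits by the temperature before the softmax. How the paper
    applies the temperature exactly is an assumption here."""
    return torch.softmax(logits / temperature, dim=-1)
```

The sleep-phase batch size of 50 and the number of sleep epochs would then be handled by the data loader feeding clipped_step.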