A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning
Authors: Soochan Lee, Junsoo Ha, Dongsu Zhang, Gunhee Kim
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "With extensive experiments, we show that our model successfully performs task-free continual learning for both discriminative and generative tasks such as image classification and image generation." (Abstract) ... "With several benchmark experiments of CL literature on MNIST, SVHN, and CIFAR 10/100, we show that our model successfully performs multiple types of CL tasks, including image classification and generation." (Section 1, Introduction) ... "We evaluate the proposed CN-DPM model in task-free CL with four benchmark datasets." (Section 4, Experiments) |
| Researcher Affiliation | Academia | Soochan Lee, Junsoo Ha, Dongsu Zhang & Gunhee Kim Department of Computer Science, Seoul National University, Seoul, Republic of Korea {soochan.lee,junsoo.ha}@vision.snu.ac.kr,{96lives,gunhee}@snu.ac.kr http://vision.snu.ac.kr/projects/cn-dpm |
| Pseudocode | Yes | Algorithm 1: Training of the Continual Neural Dirichlet Process Mixture (CN-DPM) Model. (A hedged Python sketch of this training loop appears after the table.) |
| Open Source Code | No | The paper provides a project URL (http://vision.snu.ac.kr/projects/cn-dpm) on the first page, but this is a general project page and not an explicit statement of code release or a direct link to a code repository for the methodology described in the paper. |
| Open Datasets | Yes | Split-MNIST (Zenke et al., 2017). The MNIST dataset (LeCun et al., 1998)... MNIST-SVHN (Shin et al., 2017). ...SVHN (Netzer et al., 2011)... Split-CIFAR10 and Split-CIFAR100. In Split-CIFAR10, we split CIFAR10 (Krizhevsky & Hinton, 2009)... (A sketch of the Split-MNIST task construction appears after the table.) |
| Dataset Splits | No | The paper describes how datasets are split into tasks for continual learning scenarios (e.g., Split-MNIST, Split-CIFAR10) and mentions training sets, but it does not report specific train/validation/test splits as percentages or counts, nor any explicit methodology for constructing a validation set. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer' and 'ReLU activation' but does not specify version numbers for any programming languages or libraries (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | From Appendix C.3 (Experimental Details): "We use the classifier temperature parameter of 0.01 for Split-MNIST, Split-CIFAR10/100... Weight decay 0.00001 has been used... Gradients are clipped by value with a threshold of 0.5. All the CN-DPM models are trained by Adam optimizer. During the sleep phase, we train the new expert for multiple epochs with a batch size of 50. ... The learning rate of 0.0001 and 0.0004 has been used for the classifier and VAE of each expert in the classification task." (A sketch wiring these hyperparameters together appears after the table.) |
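
The pseudocode row references Algorithm 1. Below is a minimal, self-contained Python sketch of that training loop as we read it from the paper, not the authors' code: a single pass over a stream with no task-boundary signal, responsibility-based routing among existing experts plus a "new expert" option, a short-term memory (STM) that buffers samples routed to the new-expert option, and a sleep phase that fits a new expert once the STM fills. The `Expert` class is a toy 1-D Gaussian standing in for the paper's classifier+VAE pair, and `ALPHA`, `STM_CAPACITY`, and `BASE_STD` are illustrative values, not the paper's settings.

```python
import math
import random

ALPHA = 1.0          # DP concentration: prior weight of the "new expert" option
STM_CAPACITY = 10    # short-term memory size before a sleep phase triggers
BASE_STD = 5.0       # broad base measure for scoring the new-expert option

class Expert:
    """Toy 1-D Gaussian expert standing in for the paper's classifier+VAE pair."""
    def __init__(self, data):
        self.n = len(data)
        self.mean = sum(data) / self.n
        var = sum((x - self.mean) ** 2 for x in data) / self.n
        self.std = max(math.sqrt(var), 1e-3)

    def log_likelihood(self, x):
        return -0.5 * ((x - self.mean) / self.std) ** 2 - math.log(self.std)

    def partial_fit(self, x):
        # crude online mean update standing in for a gradient step
        self.n += 1
        self.mean += (x - self.mean) / self.n

random.seed(0)
# Task-free stream: two modes presented sequentially, no boundary signal.
stream = [random.gauss(0, 1) for _ in range(200)] + \
         [random.gauss(5, 1) for _ in range(200)]

experts, stm = [], []
for x in stream:
    # Responsibility of each existing expert: log prior (proportional to its
    # data count, as in a CRP/DPM prior) plus its log-likelihood of x.
    scores = [math.log(e.n) + e.log_likelihood(x) for e in experts]
    # The "new expert" option: DP weight ALPHA under the broad base measure.
    scores.append(math.log(ALPHA)
                  - 0.5 * (x / BASE_STD) ** 2 - math.log(BASE_STD))
    k = max(range(len(scores)), key=scores.__getitem__)
    if k == len(experts):                 # routed to the prospective new expert
        stm.append(x)                     # hold the sample in short-term memory
        if len(stm) >= STM_CAPACITY:      # sleep phase: fit a new expert on the STM
            experts.append(Expert(stm))
            stm.clear()
    else:                                 # wake phase: update the responsible expert
        experts[k].partial_fit(x)

print(f"{len(experts)} experts created")  # typically 2 for this two-mode stream
```

The model grows its own capacity: a second expert appears only because the second mode's samples score higher under the new-expert option than under the first expert, which is the mechanism that makes the method task-free.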
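The datasets row quotes the standard benchmarks. As a concrete illustration, here is one way to construct the Split-MNIST task sequence (Zenke et al., 2017) with `torchvision`; the class pairing (0/1, 2/3, 4/5, 6/7, 8/9) follows the common protocol, and `task_loader` is our own helper, not code from the paper.

```python
import torch
from torch.utils.data import Subset, DataLoader
from torchvision import datasets, transforms

# Split-MNIST: MNIST divided into five two-class tasks, presented sequentially.
TASK_CLASSES = [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]

mnist = datasets.MNIST(root="./data", train=True, download=True,
                       transform=transforms.ToTensor())

def task_loader(task_id, batch_size=50):
    """Return a DataLoader containing only the two classes of the given task."""
    keep = TASK_CLASSES[task_id]
    idx = [i for i, t in enumerate(mnist.targets.tolist()) if t in keep]
    return DataLoader(Subset(mnist, idx), batch_size=batch_size, shuffle=True)

# Task-free continual learning: the model sees the concatenated stream
# without being told where one task ends and the next begins.
for task_id in range(len(TASK_CLASSES)):
    for images, labels in task_loader(task_id):
        pass  # feed (images, labels) to the continual learner one batch at a time
```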
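Finally, a sketch that wires together the hyperparameters quoted in the experiment-setup row: Adam with learning rates 1e-4 (classifier) and 4e-4 (VAE), weight decay 1e-5, gradient clipping by value at 0.5, classifier temperature 0.01, and a batch size of 50. The two modules are placeholders (the paper specifies its architectures elsewhere), and both applying the temperature by dividing logits and passing the weight decay through Adam's `weight_decay` argument are our assumptions.

```python
import torch
import torch.nn as nn

# Placeholder modules: the paper's experts pair a classifier with a VAE,
# whose actual architectures are given in the paper's appendix.
classifier = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
vae = nn.Sequential(nn.Flatten(), nn.Linear(784, 784))

# Separate Adam optimizers with the reported learning rates; routing the
# "weight decay 0.00001" through Adam's weight_decay is our interpretation.
opt_cls = torch.optim.Adam(classifier.parameters(), lr=1e-4, weight_decay=1e-5)
opt_vae = torch.optim.Adam(vae.parameters(), lr=4e-4, weight_decay=1e-5)  # VAE step omitted below

TEMPERATURE = 0.01  # classifier temperature (Split-MNIST, Split-CIFAR10/100)
CLIP_VALUE = 0.5    # "gradients are clipped by value with a threshold of 0.5"

def train_step(x, y):
    logits = classifier(x) / TEMPERATURE           # temperature-scaled logits (assumed form)
    loss = nn.functional.cross_entropy(logits, y)
    opt_cls.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_value_(classifier.parameters(), CLIP_VALUE)
    opt_cls.step()
    return loss.item()

# One step on a dummy batch of 50, the sleep-phase batch size from the paper.
x, y = torch.randn(50, 1, 28, 28), torch.randint(0, 10, (50,))
print(train_step(x, y))
```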