Bayesian Structural Adaptation for Continual Learning

Authors: Abhishek Kumar, Sunabha Chatterjee, Piyush Rai

ICML 2021

Reproducibility assessment: each variable below lists the result, followed by the supporting LLM response.
Research Type: Experimental
Evidence: "Experimental results on supervised and unsupervised benchmarks demonstrate that our approach performs comparably or better than recent advances in continual learning." From Section 5 (Experiments): "We perform experiments on both supervised and unsupervised continual learning scenarios. We also evaluate our model on a task-agnostic setup for unsupervised CL and compare our method with relevant state-of-the-art methods. In addition to the quantitative (accuracy/log-likelihood comparisons) and qualitative (generation) results, we also examine the network structures learned by our model."
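The accuracy comparisons mentioned above are task-sequential. A standard way to summarize them in continual learning is the average accuracy over all tasks after training on the final one; whether the paper uses exactly this protocol is an assumption, since the quoted text only mentions "accuracy/log-likelihood comparisons". A minimal sketch:

```python
# Sketch: average accuracy after the final task, a common continual-
# learning summary metric. That the paper follows exactly this protocol
# is an assumption, not confirmed by the quoted text.
def average_accuracy(acc_matrix):
    """acc_matrix[t][j] = accuracy on task j after training on tasks 0..t."""
    final_row = acc_matrix[-1]              # accuracies after the last task
    return sum(final_row) / len(final_row)

# Toy numbers for illustration only (not results from the paper):
accs = [
    [0.99],
    [0.97, 0.98],
    [0.95, 0.96, 0.99],
]
print(f"average accuracy: {average_accuracy(accs):.3f}")  # 0.967
```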
Researcher Affiliation: Collaboration
Evidence: "1 Microsoft, India; 2 SAP Labs, India; 3 Department of Computer Science, IIT Kanpur, India."
Pseudocode: No
Evidence: The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code: Yes
Evidence: "The code for our models can be found at this link: https://github.com/npbcl/icml21"
Open Datasets: Yes
Evidence: "We perform our evaluations on five supervised CL benchmarks: Split MNIST, Split notMNIST (small), Permuted MNIST, Split Fashion-MNIST and Split CIFAR-100." On the generative benchmarks: "For MNIST, the tasks are a sequence of single-digit generation from 0 to 9. Similarly, for notMNIST each task is single-character generation from A to J."
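For reference, a minimal sketch of how the Split MNIST task sequence is typically constructed, assuming a PyTorch/torchvision pipeline; the paper does not describe its data loading, so the details below are guesses:

```python
# Sketch: build the Split MNIST task sequence as five binary tasks
# (digits 0/1, 2/3, ..., 8/9). Assumes torchvision is available; the
# paper's own data pipeline is not specified.
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

def split_mnist_tasks(root="./data", train=True):
    ds = datasets.MNIST(root, train=train, download=True,
                        transform=transforms.ToTensor())
    tasks = []
    for d in range(0, 10, 2):  # digit pairs (0,1), (2,3), ..., (8,9)
        idx = torch.where((ds.targets == d) | (ds.targets == d + 1))[0]
        tasks.append(Subset(ds, idx.tolist()))
    return tasks

train_tasks = split_mnist_tasks(train=True)
print([len(t) for t in train_tasks])  # roughly 12k training examples each
```

Permuted MNIST is constructed analogously, with each task applying its own fixed random permutation to the input pixels.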
Dataset Splits: No
Evidence: The paper mentions training and testing but does not provide specific details or percentages for a validation split, nor does it refer to predefined validation splits with citations.
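Because no validation protocol is given, a replication would have to pick its own. A sketch of one common choice follows; the 90/10 ratio and the fixed seed are illustrative assumptions, not from the paper:

```python
# Sketch: hold out a per-task validation set. The 10% fraction and the
# fixed seed are assumptions for illustration; the paper specifies neither.
import torch
from torch.utils.data import random_split

def train_val_split(task_dataset, val_frac=0.1, seed=0):
    n_val = int(len(task_dataset) * val_frac)
    n_train = len(task_dataset) - n_val
    gen = torch.Generator().manual_seed(seed)
    return random_split(task_dataset, [n_train, n_val], generator=gen)

# e.g., applied to one task from the Split MNIST sketch above:
# train_set, val_set = train_val_split(train_tasks[0])
```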
Hardware Specification: No
Evidence: The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or processing power) used for running the experiments.
Software Dependencies: No
Evidence: The paper does not provide ancillary software details with version numbers (e.g., library or solver names with versions) needed to replicate the experiments.
Experiment Setup: No
Evidence: The main text does not explicitly provide experimental setup details such as concrete hyperparameter values, training configurations, or system-level settings.