Generalized Variational Continual Learning

Authors: Noel Loo, Siddharth Swaroop, Richard E Turner

ICLR 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, in Section 5 we test GVCL and GVCL with FiLM layers on many standard benchmarks... |
| Researcher Affiliation | Academia | Noel Loo, Siddharth Swaroop & Richard E. Turner, University of Cambridge {nl355,ss2163,ret26}@cam.ac.uk |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/yolky/gvcl |
| Open Datasets | Yes | It is derived from the HASYv2 dataset (Thoma, 2017)... For our Split-MNIST experiment, in addition to the standard 5 binary classification tasks for Split-MNIST, we add 5 more binary classification tasks by taking characters from the KMNIST dataset (Clanuwat et al., 2018)... The popular Split-CIFAR dataset, introduced in Zenke et al. (2017)... |
| Dataset Splits | Yes | Early stopping based on the validation set was used. 10% of the training set was used as validation for these methods, and for Easy and Hard CHASY, 8 samples per class form the validation set (which are disjoint from the training samples or test samples). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using a GitHub repository for HAT but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | Baseline MAP algorithms were trained with SGD with a decaying learning rate starting at 5e-2, with a maximum of 200 epochs per task... For VI models, we used the Adam optimizer with a learning rate of 1e-4 for Split-MNIST and Mixture, and 1e-3 for Easy-CHASY, Hard-CHASY and Split-CIFAR... All experiments (both the baselines and VI methods) use a batch size of 64... Table 3: Best (selected) hyperparameters for continual learning experiments for various algorithms. |
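The experiment-setup row above can be collected into a small configuration sketch. This is a minimal illustration of the quoted hyperparameters only; the exact learning-rate decay schedule for the MAP baselines is not stated in the excerpt, so the exponential decay factor below is an assumption, and all names (`map_lr`, `vi_lr`, `VI_ADAM_LR`) are hypothetical helpers, not the authors' code.

```python
# Hyperparameters quoted in the reproducibility report above.
BATCH_SIZE = 64          # used for all experiments (baselines and VI)
MAP_INITIAL_LR = 5e-2    # SGD for MAP baselines, decaying
MAP_MAX_EPOCHS = 200     # maximum epochs per task for MAP baselines

# Adam learning rates for the VI models, per benchmark.
VI_ADAM_LR = {
    "split-mnist": 1e-4,
    "mixture": 1e-4,
    "easy-chasy": 1e-3,
    "hard-chasy": 1e-3,
    "split-cifar": 1e-3,
}

def map_lr(epoch, decay=0.95):
    """Decayed SGD learning rate for the MAP baselines.

    The paper's exact decay schedule is not given in the excerpt;
    a per-epoch exponential decay is used here purely as a placeholder.
    """
    return MAP_INITIAL_LR * decay ** epoch

def vi_lr(dataset):
    """Adam learning rate used for the VI models on `dataset`."""
    return VI_ADAM_LR[dataset.lower()]
```

For example, `vi_lr("Split-CIFAR")` returns 1e-3 and `map_lr(0)` returns the initial 5e-2, matching the quoted setup.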