Multiband VAE: Latent Space Alignment for Knowledge Consolidation in Continual Learning

Authors: Kamil Deja, Paweł Wawrzyński, Wojciech Masarczyk, Daniel Marczak, Tomasz Trzciński

IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On top of the standard continual learning benchmarks, we propose a novel challenging knowledge consolidation scenario and show that the proposed approach outperforms state-of-the-art by up to twofold across all experiments and the additional real-life evaluation.
Researcher Affiliation | Collaboration | Kamil Deja (1), Paweł Wawrzyński (1), Wojciech Masarczyk (1), Daniel Marczak (1) and Tomasz Trzciński (1, 2, 3); (1) Warsaw University of Technology, (2) Jagiellonian University, (3) Tooploox
Pseudocode | No | The paper describes the method in prose and with diagrams, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The exact architecture and training hyperparameters are enlisted in the appendix and code repository": https://github.com/KamilDeja/multiband_vae
Open Datasets | Yes | "To assess the quality of our method, we conduct a series of experiments on benchmarks commonly used in continual learning (MNIST, Omniglot [Lake et al., 2015]) and generative modeling (Fashion MNIST [Xiao et al., 2017]). Since the performance of VAE on diverse datasets like CIFAR is limited, in order to evaluate how our method scales to more complex data, we include tests on CelebA [Liu et al., 2015]." (A loading sketch for these datasets appears after the table.)
Dataset Splits | No | The paper describes how data is divided into tasks for its continual learning scenarios (e.g., using a Dirichlet distribution) but does not report explicit train, validation, and test split percentages or counts for model training. (A hedged sketch of such a Dirichlet-based task split follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiment.
Experiment Setup | No | The paper states that 'The exact architecture and training hyperparameters are enlisted in the appendix and code repository', but these details are not provided directly in the main text.
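The Open Datasets row lists MNIST, Fashion MNIST, Omniglot, and CelebA. As a hedged illustration only, the sketch below loads these benchmarks through torchvision; the `data` root directory and the plain `ToTensor` transform are assumptions made for the example and are not the paper's actual preprocessing.

```python
# Minimal, illustrative loading of the benchmarks named in the Open Datasets row.
# Assumes torchvision is installed; paths and transforms are placeholders.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

mnist = datasets.MNIST("data", train=True, download=True, transform=to_tensor)
fashion = datasets.FashionMNIST("data", train=True, download=True, transform=to_tensor)
omniglot = datasets.Omniglot("data", background=True, download=True, transform=to_tensor)
celeba = datasets.CelebA("data", split="train", download=True, transform=to_tensor)

print(len(mnist), len(fashion), len(omniglot), len(celeba))
```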
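The Dataset Splits row notes that tasks are formed with a Dirichlet distribution but gives no split counts. The sketch below is a minimal, hypothetical version of such a class-imbalanced task split: the function name `dirichlet_task_split`, the `alpha` concentration parameter, and the toy label vector are illustrative assumptions and do not reproduce the paper's exact procedure.

```python
import numpy as np

def dirichlet_task_split(labels, n_tasks, alpha=1.0, seed=0):
    """Assign every sample index to one of `n_tasks` tasks.

    For each class, proportions over tasks are drawn from a Dirichlet
    distribution; this is a hypothetical sketch, not the paper's code.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    task_indices = [[] for _ in range(n_tasks)]
    for c in np.unique(labels):
        # Shuffle this class's sample indices, then cut them according to
        # task proportions sampled from a Dirichlet distribution.
        idx = rng.permutation(np.where(labels == c)[0])
        proportions = rng.dirichlet(alpha * np.ones(n_tasks))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for task, chunk in enumerate(np.split(idx, cuts)):
            task_indices[task].extend(chunk.tolist())
    return [np.array(t) for t in task_indices]

# Toy example: 10 classes with 100 samples each, split into 5 tasks.
labels = np.repeat(np.arange(10), 100)
tasks = dirichlet_task_split(labels, n_tasks=5, alpha=0.5)
print([len(t) for t in tasks])  # task sizes vary with alpha
```

Smaller values of `alpha` concentrate each class in fewer tasks, making the class distribution per task more imbalanced; larger values approach a uniform split.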