Multiband VAE: Latent Space Alignment for Knowledge Consolidation in Continual Learning
Authors: Kamil Deja, Paweł Wawrzyński, Wojciech Masarczyk, Daniel Marczak, Tomasz Trzciński
IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On top of the standard continual learning benchmarks, we propose a novel challenging knowledge consolidation scenario and show that the proposed approach outperforms state-of-the-art by up to twofold across all experiments and the additional real-life evaluation. |
| Researcher Affiliation | Collaboration | Kamil Deja¹, Paweł Wawrzyński¹, Wojciech Masarczyk¹, Daniel Marczak¹ and Tomasz Trzciński¹,²,³ (¹Warsaw University of Technology, ²Jagiellonian University, ³Tooploox) |
| Pseudocode | No | The paper describes the method in prose and with diagrams, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The exact architecture and training hyperparameters are enlisted in the appendix and code repository: https://github.com/KamilDeja/multiband_vae |
| Open Datasets | Yes | To assess the quality of our method, we conduct a series of experiments on benchmarks commonly used in continual learning (MNIST, Omniglot [Lake et al., 2015]) and generative modeling (Fashion MNIST [Xiao et al., 2017]). Since the performance of VAE on diverse datasets like CIFAR is limited, in order to evaluate how our method scales to more complex data, we include tests on CelebA [Liu et al., 2015]. |
| Dataset Splits | No | The paper describes how data is split into tasks for the continual learning scenarios (e.g., class proportions drawn from a Dirichlet distribution; see the sketch after this table) but does not provide explicit train, validation, and test split percentages or counts for model training. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiment. |
| Experiment Setup | No | The paper states that 'The exact architecture and training hyperparameters are enlisted in the appendix and code repository1.', but these details are not provided directly in the main text. |
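
The Dirichlet-based task split noted in the Dataset Splits row is the part of the setup a reader is most likely to want to reconstruct. Below is a minimal sketch of one common way such a split is done, not the authors' code: the function name `dirichlet_task_split`, the `alpha` parameter, and the example label array are all hypothetical, and the exact procedure used in the paper may differ.

```python
import numpy as np

def dirichlet_task_split(labels, n_tasks, alpha=1.0, seed=0):
    """Hypothetical sketch: assign sample indices to continual-learning tasks.

    For each class, the fraction of its samples routed to every task is drawn
    from a Dirichlet(alpha) distribution, so a small alpha yields skewed,
    class-imbalanced tasks and a large alpha yields near-uniform tasks.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    tasks = [[] for _ in range(n_tasks)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        # Proportions of class-c samples sent to each task.
        props = rng.dirichlet(alpha * np.ones(n_tasks))
        cuts = (np.cumsum(props) * len(idx)).astype(int)[:-1]
        for t, chunk in enumerate(np.split(idx, cuts)):
            tasks[t].extend(chunk.tolist())
    return tasks

# Example: split 10-class labels into 5 tasks with a mildly skewed prior.
labels = np.random.randint(0, 10, size=60000)
tasks = dirichlet_task_split(labels, n_tasks=5, alpha=0.5)
print([len(t) for t in tasks])
```

For the exact task definitions and hyperparameters used in the experiments, the repository linked in the Open Source Code row remains the authoritative source.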