Continual Learning with Bayesian Neural Networks for Non-Stationary Data

Authors: Richard Kurle, Botond Cseke, Alexej Klushyn, Patrick van der Smagt, Stephan Günnemann

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results show that our update method improves on existing approaches for streaming data. Additionally, the adaptation methods lead to better predictive performance for non-stationary data. We compare our sequential update method to VCL in the online-inference setting on several popular datasets, demonstrating that our approach is favorable. Furthermore, we validate our adaptation methods on several datasets with concept drift (Widmer & Kubat, 1996), showing performance improvements compared to online variational Bayes without adaptation.
Researcher Affiliation | Collaboration | (1) Volkswagen Group, (2) Technical University of Munich
Pseudocode | Yes | Appendix G: PSEUDO-ALGORITHM... Algorithm 1 Gaussian Residual Scoring with Bayesian forgetting. (A hedged sketch of the Bayesian forgetting update follows the table.)
Open Source Code | No | The paper does not provide any explicit statement or link indicating the public release of the source code for the described methodology.
Open Datasets | Yes | We evaluate our running memory (Sec. 3) in an online learning setting... regression (UCI Boston, UCI Concrete, UCI Energy, UCI Yacht) and classification (MNIST, UCI Spam, UCI Wine) tasks. We also evaluate our adaptation methods quantitatively on 3 datasets with concept drift (Weather, Gas Sensor Array Drift, Covertype).
Dataset Splits | No | For evaluation, we use a random held-out test dataset (20% of the data). We perform each experiment with 16 different random data splits and random seeds for the model parameter initialisation. (An illustrative sketch of this protocol follows the table.)
Hardware Specification | No | No specific hardware (e.g., CPU, GPU models, or cloud computing instances) used for running the experiments is mentioned in the paper.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments.
Experiment Setup | Yes | In Tab. 2, we summarise the experimental setup (hyperparameters) used for Secs. 6.1 and 6.2. Table 2: Experiment setup for online experiments. N_t0 is the number of observed samples at the first time-step and N_t1:K is the dataset size of all other time-steps. M refers to the number of samples in the memory. K_train and K_term are the numbers of MC samples used for training and for estimating the Gaussian terms, respectively. I_t0 and I_t1:K refer to the number of iterations for the first time-step and all subsequent time-steps, respectively. The architecture denotes the number of units in each hidden layer. (An illustrative config schema follows the table.)
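
The Pseudocode row above cites Algorithm 1, "Gaussian Residual Scoring with Bayesian forgetting". For orientation, Bayesian forgetting tempers the variational posterior toward the prior, q_new(θ) ∝ q(θ)^(1-ε) p(θ)^ε, which for Gaussians amounts to interpolating natural parameters. The sketch below is not the authors' released code; it assumes diagonal Gaussians and a fixed scalar forgetting rate ε, and the function name is illustrative.

```python
import numpy as np

def bayesian_forgetting(mu_q, var_q, mu_p, var_p, eps):
    """Temper a diagonal Gaussian posterior q toward the prior p.

    Implements q_new(theta) ∝ q(theta)**(1 - eps) * p(theta)**eps.
    For Gaussians this interpolates the natural parameters:
    precisions mix linearly, and the new mean is the
    precision-weighted average of the two means.
    """
    prec_q, prec_p = 1.0 / var_q, 1.0 / var_p
    prec_new = (1.0 - eps) * prec_q + eps * prec_p
    mu_new = ((1.0 - eps) * prec_q * mu_q + eps * prec_p * mu_p) / prec_new
    return mu_new, 1.0 / prec_new
```

With eps = 0 the posterior is kept unchanged and with eps = 1 it is reset to the prior, so intermediate values trade retention of old knowledge against adaptation to concept drift.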
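The Dataset Splits row describes a 20% random held-out test set repeated over 16 random splits and seeds. Below is a minimal sketch of that protocol, with scikit-learn's BayesianRidge standing in for the paper's Bayesian neural network; the stand-in model and the RMSE metric are assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import train_test_split

def run_protocol(X, y, n_repeats=16, test_frac=0.2):
    """Evaluate over repeated random 80/20 splits, one seed per repeat."""
    rmses = []
    for seed in range(n_repeats):
        # A fresh random split (and model initialisation) per repetition.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_frac, random_state=seed)
        model = BayesianRidge().fit(X_tr, y_tr)
        err = y_te - model.predict(X_te)
        rmses.append(np.sqrt(np.mean(err ** 2)))
    # Summarise across repetitions, as is common for this protocol.
    return float(np.mean(rmses)), float(np.std(rmses))
```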
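The Experiment Setup row names the hyperparameters of the paper's Table 2 without structure. The dataclass below captures that schema; all default values are placeholders for illustration, not the paper's reported settings.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class OnlineExperimentConfig:
    """Hyperparameter schema mirroring the symbols quoted from Table 2.

    Defaults are illustrative placeholders, not the paper's values.
    """
    n_t0: int = 500          # N_t0: samples observed at the first time-step
    n_t1K: int = 10          # N_t1:K: dataset size at every later time-step
    memory_size: int = 200   # M: samples kept in the running memory
    k_train: int = 10        # K_train: MC samples used for training
    k_term: int = 100        # K_term: MC samples for the Gaussian terms
    i_t0: int = 2000         # I_t0: iterations at the first time-step
    i_t1K: int = 200         # I_t1:K: iterations at each subsequent step
    hidden_units: Tuple[int, ...] = (100,)  # units per hidden layer
```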