Posterior Meta-Replay for Continual Learning
Authors: Christian Henning, Maria Cervera, Francesco D'Angelo, Johannes von Oswald, Regina Traber, Benjamin Ehret, Seijin Kobayashi, Benjamin F. Grewe, João Sacramento
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on standard benchmarks show that our probabilistic hypernetworks compress sequences of posterior parameter distributions with virtually no forgetting. In this section, we start by illustrating the conceptual advantage of the Posterior Replay approach compared to Prior Focused methods, as well as the importance of parameter uncertainty. We then explore scalability to more challenging computer vision CL benchmarks. (An illustrative hypernetwork sketch follows the table.) |
| Researcher Affiliation | Academia | Institute of Neuroinformatics, University of Zürich and ETH Zürich, Zürich, Switzerland; {henningc,mariacer}@ethz.ch |
| Pseudocode | No | The paper describes methods in narrative text and figures (e.g., Fig. 2) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Source code for all experiments (including all baselines) is available at https://github.com/chrhenning/posterior_replay_cl. |
| Open Datasets | Yes | To investigate the factors that affect uncertainty-based task inference, we next consider Split MNIST [96], a popular variant of the MNIST dataset, adapted to CL by splitting the ten digit classes into five binary classification tasks. Here we show that our approach scales to natural images by considering Split CIFAR-10 [39], a dataset consisting of five tasks with two classes each. (A sketch of the Split MNIST construction follows the table.) |
| Dataset Splits | No | The paper mentions using Split MNIST and Split CIFAR-10, but defers details to supplementary materials ('all experimental details can be found in SM E.2') and does not provide percentages, sample counts, or citations for the train/validation/test splits in the main text. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory, or cloud instances) used to run its experiments. |
| Software Dependencies | No | The paper does not give a reproducible description of ancillary software: no specific software packages are listed with version numbers. |
| Experiment Setup | No | The paper defers all specific experimental setup details, including hyperparameters and training settings, to the supplementary materials, stating that 'all experimental details can be found in SM E.2'. |
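
The "Research Type" row quotes the paper's core idea: a probabilistic hypernetwork that compresses a sequence of task-specific posterior parameter distributions. As a rough illustration of that idea only (the paper's actual architecture and regularizer are detailed in the paper and the linked repository), the minimal PyTorch sketch below maps a learned per-task embedding to the mean and log-variance of a Gaussian mean-field posterior over a main network's weights. All names (`PosteriorHypernet`, `sample_weights`) and layer sizes are hypothetical.

```python
import torch
import torch.nn as nn


class PosteriorHypernet(nn.Module):
    """Illustrative sketch (not the paper's architecture): a hypernetwork
    that maps a trainable task embedding to the parameters (mean and
    log-variance) of a Gaussian posterior over main-network weights."""

    def __init__(self, n_tasks, emb_dim, n_main_params, hidden=100):
        super().__init__()
        # One trainable embedding per task; new tasks add new embeddings.
        self.task_emb = nn.Embedding(n_tasks, emb_dim)
        self.body = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.mu_head = nn.Linear(hidden, n_main_params)
        self.logvar_head = nn.Linear(hidden, n_main_params)

    def forward(self, task_id):
        h = self.body(self.task_emb(torch.tensor([task_id])))
        return self.mu_head(h), self.logvar_head(h)

    def sample_weights(self, task_id):
        # Reparameterization trick: w = mu + sigma * eps, so samples
        # stay differentiable with respect to the hypernetwork.
        mu, logvar = self.forward(task_id)
        eps = torch.randn_like(mu)
        return mu + torch.exp(0.5 * logvar) * eps
```

In this framing, conditioning on the task embedding is what lets a single hypernetwork represent one posterior per task, and forgetting is controlled by regularizing the hypernetwork's outputs for previous task embeddings.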
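
The "Open Datasets" row quotes the paper's use of Split MNIST: the ten MNIST digit classes split into five binary classification tasks. A minimal sketch of that construction with torchvision is shown below, assuming the standard pairing of consecutive digits ((0,1), (2,3), ..., (8,9)); the paper itself defers exact experimental details to SM E.2, and the helper name `split_mnist_tasks` is hypothetical.

```python
from torch.utils.data import Subset
from torchvision import datasets, transforms


def split_mnist_tasks(root="./data", train=True):
    """Build five binary Split MNIST tasks by pairing consecutive
    digit classes: (0,1), (2,3), (4,5), (6,7), (8,9)."""
    mnist = datasets.MNIST(root, train=train, download=True,
                           transform=transforms.ToTensor())
    tasks = []
    for t in range(5):
        classes = (2 * t, 2 * t + 1)
        # Select only the indices whose label belongs to this task.
        idx = [i for i, y in enumerate(mnist.targets.tolist())
               if y in classes]
        tasks.append(Subset(mnist, idx))
    return tasks


# Usage: five datasets presented sequentially as a CL problem.
train_tasks = split_mnist_tasks(train=True)
```

Split CIFAR-10, mentioned in the same row, follows the same pattern (five tasks of two classes each) with `datasets.CIFAR10` in place of MNIST.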