Online Structured Laplace Approximations for Overcoming Catastrophic Forgetting
Authors: Hippolyt Ritter, Aleksandar Botev, David Barber
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our algorithm achieves over 90% test accuracy across a sequence of 50 instantiations of the permuted MNIST dataset, substantially outperforming related methods for overcoming catastrophic forgetting. |
| Researcher Affiliation | Collaboration | Hippolyt Ritter¹, Aleksandar Botev¹, David Barber¹,²,³ (¹University College London; ²Alan Turing Institute; ³reinfer.io) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our fork with code to calculate the Kronecker factors is available at: www.github.com/BB-UCL/Lasagne |
| Open Datasets | Yes | As a first experiment, we test on a sequence of permutations of the MNIST dataset [19]. (A sketch of how such permuted tasks are conventionally generated appears below the table.) |
| Dataset Splits | No | The paper reports 'mean test accuracy as new datasets are observed for the optimal hyperparameters of each method' and refers to 'the value that optimizes the validation error', confirming that a validation set was used. However, it does not provide split percentages, sample counts, or a citation to predefined train/validation splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper states: 'We implement our experiments using Theano [39] and Lasagne [3] software libraries.' However, it does not specify version numbers for these software libraries, which is required for a reproducible description. |
| Experiment Setup | Yes | For the permuted MNIST dataset, we used the Adam [15] optimizer with a learning rate of 0.001 and mini-batches of size 128. For the disjoint MNIST and vision datasets, we used Nesterov momentum [28, 32] with a learning rate of 0.1, a momentum of 0.9, and mini-batches of size 250. We trained each task for 200 epochs. (See the optimizer sketch below the table.) |
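
The permuted MNIST benchmark quoted in the Open Datasets row is conventionally constructed by applying one fixed random pixel permutation per task to every image. A minimal NumPy sketch under that assumption; the task count matches the paper's 50 instantiations, while the seed and the identity permutation for the first task are illustrative choices, not taken from the paper:

```python
import numpy as np

def make_permuted_tasks(images, n_tasks=50, seed=0):
    """Build permuted-MNIST tasks: each task applies one fixed random
    pixel permutation to every image (flattened to 784-vectors)."""
    rng = np.random.RandomState(seed)
    flat = images.reshape(len(images), -1)  # (N, 784)
    tasks = []
    for t in range(n_tasks):
        # Identity permutation for task 0 is a common (assumed) convention.
        perm = (np.arange(flat.shape[1]) if t == 0
                else rng.permutation(flat.shape[1]))
        tasks.append(flat[:, perm])
    return tasks

# Usage with random stand-in data shaped like MNIST:
dummy = np.random.rand(100, 28, 28).astype(np.float32)
tasks = make_permuted_tasks(dummy, n_tasks=3)
print([t.shape for t in tasks])  # [(100, 784), (100, 784), (100, 784)]
```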
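
The Experiment Setup row quotes the optimizer hyperparameters directly. Since the paper's implementation uses Theano and Lasagne (see Software Dependencies), those settings map onto Lasagne's update rules roughly as sketched below; the two-layer network is a hypothetical stand-in, not an architecture from the paper:

```python
import theano.tensor as T
import lasagne

x, y = T.matrix('x'), T.ivector('y')

# Stand-in MLP; the paper's actual architectures are not reproduced here.
net = lasagne.layers.InputLayer((None, 784), input_var=x)
net = lasagne.layers.DenseLayer(net, num_units=100)
net = lasagne.layers.DenseLayer(net, num_units=10,
                                nonlinearity=lasagne.nonlinearities.softmax)

pred = lasagne.layers.get_output(net)
loss = lasagne.objectives.categorical_crossentropy(pred, y).mean()
params = lasagne.layers.get_all_params(net, trainable=True)

# Permuted MNIST: Adam with learning rate 0.001 (the mini-batches of
# size 128 would be handled by the training loop, not shown here).
updates_permuted = lasagne.updates.adam(loss, params, learning_rate=0.001)

# Disjoint MNIST / vision datasets: Nesterov momentum with learning
# rate 0.1 and momentum 0.9 (mini-batches of 250, 200 epochs per task).
updates_disjoint = lasagne.updates.nesterov_momentum(
    loss, params, learning_rate=0.1, momentum=0.9)
```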