Addressing Catastrophic Forgetting in Few-Shot Problems

Authors: Pauching Yap, Hippolyt Ritter, David Barber

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The experimental evaluations demonstrate that our framework can effectively achieve this goal in comparison with various baselines." "6. Experiments: We implement BOMLA and BOMVI on the 5-way 1-shot triathlon and pentathlon sequences."
Researcher Affiliation | Academia | "Department of Computer Science, University College London, London, United Kingdom; Alan Turing Institute, London, United Kingdom."
Pseudocode | Yes | "The pseudo-code of the BOMLA algorithm can be found in Appendix B.1." "The pseudo-code of the BOMVI algorithm can be found in Appendix B.1."
Open Source Code | Yes | "Implementation code is available at https://github.com/pauchingyap/boml"
Open Datasets | Yes | "Some popular examples of the few-shot classification datasets are Omniglot (Lake et al., 2011), CIFAR-FS (Bertinetto et al., 2019) and miniImageNet (Vinyals et al., 2016)." "We run the sequential-tasks experiment on the Omniglot dataset."
Dataset Splits | Yes | "The N-way K-shot task, for instance, refers to sampling N classes and using K examples per class for few-shot quick adaptation." "The experimental details and the dataset explanations are in Appendix C.1." "A newly-arrived 𝒟_{t+1} is separated into the base class set D_{t+1} and the novel class set D̂_{t+1} for meta-training and meta-evaluation respectively." (A sketch of N-way K-shot task sampling appears after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions the use of MAML and other algorithms, but it does not provide specific version numbers for any software dependencies (e.g., Python version, or library versions such as PyTorch or TensorFlow).
Experiment Setup | Yes | "We implement BOMLA and BOMVI on the 5-way 1-shot triathlon and pentathlon sequences." "BOMLA with λ = 100 gives good performance in the off-diagonal plots..." "Meta-evaluation accuracy across 3 seed runs on each dataset along meta-training." "Each iteration of the MAML algorithm samples M tasks from the base class set D and runs a few steps of stochastic gradient descent (SGD) for inner-loop task-specific learning." (A sketch of the MAML inner loop appears after the table.)
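For readers unfamiliar with the N-way K-shot protocol quoted in the Dataset Splits row, the sketch below shows one common way such tasks are sampled. It is a minimal illustration, not the authors' implementation: the sample_task helper, its default values, and the dict-based dataset layout are all assumptions.

```python
import random

def sample_task(dataset, n_way=5, k_shot=1, k_query=15):
    """Sample one N-way K-shot task (hypothetical helper, not from the paper).

    `dataset` is assumed to map each class label to a list of examples.
    The support set (K examples per class) drives inner-loop adaptation;
    the query set evaluates the adapted model for the outer-loop update.
    """
    classes = random.sample(sorted(dataset.keys()), n_way)
    support, query = [], []
    for task_label, cls in enumerate(classes):
        examples = random.sample(dataset[cls], k_shot + k_query)
        support += [(x, task_label) for x in examples[:k_shot]]
        query += [(x, task_label) for x in examples[k_shot:]]
    return support, query
```

Labels are re-indexed to 0..N-1 per task, since few-shot learners classify among the N sampled classes rather than over the full label space.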
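The Experiment Setup row references MAML's inner loop: each meta-iteration samples M tasks and runs a few SGD steps of task-specific learning. Below is a minimal PyTorch sketch of that inner loop under stated assumptions: it requires torch >= 2.0 for torch.func.functional_call, and the step size and step count are illustrative rather than the paper's values. The Bayesian regularization that BOMLA/BOMVI add on top of MAML to combat forgetting is not shown.

```python
import torch

def inner_loop(model, loss_fn, support_x, support_y, inner_lr=0.4, inner_steps=5):
    """Task-specific adaptation as in MAML (sketch; hyperparameters illustrative).

    Runs a few SGD steps on the support set while keeping the computation
    graph, so the outer (meta) loss can differentiate through the updates.
    """
    params = dict(model.named_parameters())
    for _ in range(inner_steps):
        preds = torch.func.functional_call(model, params, (support_x,))
        loss = loss_fn(preds, support_y)
        grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)
        params = {name: p - inner_lr * g
                  for (name, p), g in zip(params.items(), grads)}
    return params  # adapted parameters; evaluate these on the query set
```

The outer loop would then compute the query-set loss under the adapted parameters and backpropagate to the shared initialization; in BOMLA, per the paper's framing, this meta-update is additionally regularized via a Laplace approximation to the posterior over previously seen datasets.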