Progress & Compress: A scalable framework for continual learning

Authors: Jonathan Schwarz, Wojciech Czarnecki, Jelena Luketina, Agnieszka Grabska-Barwinska, Yee Whye Teh, Razvan Pascanu, Raia Hadsell

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the progress & compress approach on sequential classification of handwritten alphabets as well as two reinforcement learning domains: Atari games and 3D maze navigation. We now provide an assessment of the suitability of P&C as a continual learning method, conducting experiments to test against the desiderata introduced in Section 1. We introduce experiments varying in the nature of the learning task, their difficulty and the similarity between tasks. To evaluate P&C for supervised learning, we first consider the sequential learning of handwritten characters of 50 alphabets taken from the Omniglot dataset (Lake et al., 2015)."
Researcher Affiliation | Collaboration | "DeepMind, London, United Kingdom; Department of Computer Science, University of Oxford, Oxford, United Kingdom. Correspondence to: Jonathan Schwarz <schwarzjn@google.com>, Razvan Pascanu <razp@google.com>."
Pseudocode | No | The paper describes the algorithm steps in paragraph form and through mathematical equations (Equations 1 through 9) but does not contain a clearly labeled pseudocode or algorithm block (a hedged sketch of the compress-phase objective follows this table).
Open Source Code | No | The paper does not state that its source code is publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | "To evaluate P&C for supervised learning, we first consider the sequential learning of handwritten characters of 50 alphabets taken from the Omniglot dataset (Lake et al., 2015)." "Assessing P&C under more challenging conditions, we also consider the sequential learning of 6 games in the Atari suite (Bellemare et al., 2012) (Space Invaders, Krull, Beamrider, Hero, Stargunner and Ms. Pac-man)."
Dataset Splits | No | The paper mentions the Omniglot dataset, the Atari suite, and 3D environments, and discusses "test performance" and a "held-out maze", but it does not specify how these datasets were split into training, validation, and test sets (e.g., exact percentages or sample counts), nor does it cite predefined splits.
Hardware Specification | No | The paper mentions a "distributed variant of the actor-critic architecture" and notes that "All RL results are obtained by running an identical experiment with 4 random seeds", but it does not specify hardware such as GPU or CPU models.
Software Dependencies | No | The paper mentions neural networks and an actor-critic architecture but does not name specific software packages or version numbers (e.g., TensorFlow, PyTorch, or Python versions) that would be needed to reproduce the experiments.
Experiment Setup | No | "Training and architecture details are given in the Appendix." The paper states that experimental setup details are provided in the appendix, so they do not appear in the main text.
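Since the Pseudocode row notes that the method is specified only in prose and equations, the following is a minimal PyTorch-style sketch of the compress-phase objective the paper describes: distilling the active column's policy into the knowledge base while an online-EWC penalty keeps the knowledge base close to its previous parameters. The function names, the ewc_lambda and gamma values, and the temperature knob are illustrative assumptions made here; this is not the authors' code (none is released) and only mirrors the paper's description at a high level.

```python
import torch
import torch.nn.functional as F


def compress_loss(kb_logits, active_logits, kb_params, prev_kb_params,
                  fisher_diag, ewc_lambda=25.0, temperature=1.0):
    """Compress-phase objective (sketch): distillation KL(pi_active || pi_kb)
    plus a quadratic online-EWC penalty on the knowledge-base parameters.
    Hyperparameter names and values are illustrative, not from the paper."""
    # Distillation term: fit the knowledge base to the active column's policy.
    active_logp = F.log_softmax(active_logits / temperature, dim=-1)
    kb_logp = F.log_softmax(kb_logits / temperature, dim=-1)
    distill = F.kl_div(kb_logp, active_logp, reduction="batchmean",
                       log_target=True)

    # Online-EWC term: penalize movement away from the previous KB solution,
    # weighted by the running diagonal Fisher estimate.
    penalty = 0.0
    for p, p_old, f in zip(kb_params, prev_kb_params, fisher_diag):
        penalty = penalty + (f * (p - p_old) ** 2).sum()
    return distill + 0.5 * ewc_lambda * penalty


def update_fisher(fisher_diag, new_fisher, gamma=0.95):
    """Online Fisher accumulation, F_i = gamma * F_{i-1} + F_i_new, so a single
    penalty term stands in for all previously seen tasks."""
    return [gamma * f_old + f_new
            for f_old, f_new in zip(fisher_diag, new_fisher)]


if __name__ == "__main__":
    # Toy shapes only; in P&C these would be policy logits over actions.
    kb_logits = torch.randn(8, 4, requires_grad=True)
    active_logits = torch.randn(8, 4)
    params = [torch.randn(3, 3, requires_grad=True)]
    old_params = [p.detach().clone() for p in params]
    fisher = [torch.ones_like(p) for p in params]
    loss = compress_loss(kb_logits, active_logits, params, old_params, fisher)
    loss.backward()
```

The update_fisher helper reflects the online accumulation of Fisher information described in the paper (a decayed running sum rather than one Fisher per task), which is what allows a single quadratic penalty to cover all earlier tasks.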