Progress & Compress: A scalable framework for continual learning
Authors: Jonathan Schwarz, Wojciech Czarnecki, Jelena Luketina, Agnieszka Grabska-Barwinska, Yee Whye Teh, Razvan Pascanu, Raia Hadsell
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the progress & compress approach on sequential classification of handwritten alphabets as well as two reinforcement learning domains: Atari games and 3D maze navigation. We now provide an assessment of the suitability of P&C as a continual learning method, conducting experiments to test against the desiderata introduced in Section 1. We introduce experiments varying in the nature of the learning task, their difficulty and the similarity between tasks. To evaluate P&C for supervised learning, we first consider the sequential learning of handwritten characters of 50 alphabets taken from the Omniglot dataset (Lake et al., 2015). |
| Researcher Affiliation | Collaboration | ¹DeepMind, London, United Kingdom; ²Department of Computer Science, University of Oxford, Oxford, United Kingdom. Correspondence to: Jonathan Schwarz <schwarzjn@google.com>, Razvan Pascanu <razp@google.com>. |
| Pseudocode | No | The paper describes the algorithm steps in paragraph form and through mathematical equations (Equations 1–9) but does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about making its source code publicly available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | To evaluate P&C for supervised learning, we first consider the sequential learning of handwritten characters of 50 alphabets taken from the Omniglot dataset (Lake et al., 2015). Assessing P&C under more challenging conditions, we also consider the sequential learning of 6 games in the Atari suite (Bellemare et al., 2012) (Space Invaders, Krull, Beamrider, Hero, Stargunner and Ms. Pac-man). |
| Dataset Splits | No | The paper mentions using "Omniglot dataset", "Atari suite", and "3D environments", and discusses "test performance" and "held-out maze", but it does not provide specific details on how these datasets were split into training, validation, and test sets (e.g., exact percentages or sample counts), nor does it reference a specific citation for predefined splits. |
| Hardware Specification | No | The paper mentions "distributed variant of the actor-critic architecture" and that "All RL results are obtained by running an identical experiment with 4 random seeds" but does not specify any particular hardware components such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions "neural networks" and "actor-critic architecture" but does not provide specific software names or version numbers (e.g., TensorFlow version, PyTorch version, Python version) that would be needed to reproduce the experiments. |
| Experiment Setup | No | Training and architecture details are given in the Appendix. The paper explicitly states that experimental setup details are provided in the appendix, indicating that they are not present in the main text. |
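Although the paper presents its method only through prose and equations rather than labeled pseudocode, the compress phase it describes is widely understood to combine distillation from the active column into the knowledge base with an online-EWC-style quadratic penalty. The sketch below is a minimal, hypothetical illustration of such a loss in NumPy; the function names, toy inputs, and the scalar weighting are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) between two categorical distributions (1-D arrays)."""
    return float(np.sum(p * np.log(p / q)))

def compress_loss(active_probs, kb_probs, kb_params, old_params, fisher, lam=1.0):
    """Toy compress-phase objective (hypothetical sketch):
    distillation KL from the active column's output distribution to the
    knowledge base's output, plus an online-EWC-style quadratic penalty
    anchoring the knowledge base near its previous parameters, weighted
    by a (diagonal) Fisher estimate."""
    distill = kl_divergence(active_probs, kb_probs)
    ewc = 0.5 * lam * float(np.sum(fisher * (kb_params - old_params) ** 2))
    return distill + ewc
```

When the knowledge base already matches the active column and has not moved from its previous parameters, both terms vanish, which is the intended fixed point of the compress step.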