Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference

Authors: Matthew Riemer, Ignacio Cases, Robert Ajemian, Miao Liu, Irina Rish, Yuhai Tu, and Gerald Tesauro

ICLR 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We conduct experiments across continual lifelong supervised learning benchmarks and non-stationary reinforcement learning environments demonstrating that our approach consistently outperforms recently proposed baselines for continual learning." |
| Researcher Affiliation | Collaboration | (1) IBM Research, Yorktown Heights, NY; (2) Linguistics and Computer Science Departments, Stanford NLP Group, Stanford University; (3) MIT-IBM Watson AI Lab; (4) Department of Brain and Cognitive Sciences, MIT |
| Pseudocode | Yes | Algorithm 1: Meta-Experience Replay (MER); Algorithm 2: Reptile for Stationary Data; Algorithm 3: Reservoir Sampling with Algorithm R; Algorithm 4: Experience Replay (ER) with Reservoir Sampling; Algorithm 5: Experience Replay (ER) with Tasks; Algorithm 6: Meta-Experience Replay (MER) One Big Batch; Algorithm 7: Meta-Experience Replay (MER) Current Example Learning Rate; Algorithm 8: Deep Q-learning with Meta-Experience Replay (MER). See the sketches after this table. |
| Open Source Code | Yes | Code available at https://github.com/mattriemer/mer. |
| Open Datasets | Yes | "We consider two continual learning benchmarks from Lopez-Paz & Ranzato (2017). MNIST Permutations is a variant of MNIST... MNIST Rotations is another variant of MNIST... Additionally, we also explore the Omniglot (Lake et al., 2011) benchmark..." See the task-generation sketch below. |
| Dataset Splits | Yes | "Following multi-task learning conventions, 90% of the data is used for training and 10% is used for testing (Yang & Hospedales, 2017). ... We follow the standard benchmark setting from Lopez-Paz & Ranzato (2017)..." |
| Hardware Specification | No | Figure 3 states: "We plot retained training accuracy, retained testing accuracy, and computation time for the entire training period using one CPU." However, it does not specify the CPU model, GPU, or any other detailed hardware specifications. |
| Software Dependencies | No | The paper mentions environments like the "Pygame learning environment" and uses standard models like DQN, but it does not specify version numbers for any software frameworks (e.g., TensorFlow, PyTorch) or libraries used in the experiments. |
| Experiment Setup | Yes | "We provide detailed information about our architectures and hyperparameters in Appendix J. ... In Appendix J.1 HYPERPARAMETER SEARCH ... Here we report the hyper-parameter grids that we searched over in our experiments. ... For the supervised continual learning benchmarks leveraging MNIST Rotations and MNIST Permutations, following conventions, we use a two layer MLP architecture for all models with 100 hidden units in each layer." See the architecture sketch below. |
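
To make the listed pseudocode concrete, below is a minimal sketch of reservoir sampling with Algorithm R (the paper's Algorithm 3), which maintains the replay buffer so that every example seen so far has an equal probability of being retained. The class name `ReservoirBuffer` is our illustrative choice, not an identifier from the released code.

```python
import random

class ReservoirBuffer:
    """Fixed-size replay buffer filled with reservoir sampling (Algorithm R),
    so every example observed so far is retained with equal probability."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.n_seen = 0  # total examples observed so far

    def add(self, example):
        self.n_seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Keep the new example with probability capacity / n_seen,
            # overwriting a uniformly random slot.
            j = random.randrange(self.n_seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        # Draw up to k stored examples uniformly at random without replacement.
        return random.sample(self.buffer, min(k, len(self.buffer)))
```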
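
Building on that buffer, here is our reading of the core MER update (the paper's Algorithm 1): plain SGD steps over batches drawn from the buffer plus the current example, followed by within-batch and across-batch Reptile-style interpolations back toward the pre-update weights. The function name `mer_train_step`, its default hyperparameter values, and the use of PyTorch are assumptions for illustration; the authors' released implementation may differ in details.

```python
import copy
import torch

def mer_train_step(model, loss_fn, buffer, x, y,
                   alpha=0.01, beta=1.0, gamma=0.1,
                   num_batches=5, batch_size=10):
    """One MER step on the current example (x, y) (sketch of Algorithm 1).

    alpha: inner-loop SGD learning rate
    beta:  within-batch Reptile rate
    gamma: across-batch Reptile rate
    Each stored example is an (input, target) pair with a leading batch dim.
    """
    theta_A0 = copy.deepcopy(model.state_dict())  # weights before this step

    for _ in range(num_batches):
        theta_W0 = copy.deepcopy(model.state_dict())  # weights before batch
        # Draw memories from the buffer; the current example is always included.
        batch = buffer.sample(batch_size - 1) + [(x, y)]
        for bx, by in batch:
            # One plain SGD step per example: the Reptile inner loop.
            model.zero_grad()
            loss_fn(model(bx), by).backward()
            with torch.no_grad():
                for p in model.parameters():
                    p -= alpha * p.grad
        # Within-batch Reptile meta-update:
        #   theta <- theta_W0 + beta * (theta - theta_W0)
        _interpolate(model, theta_W0, beta)

    # Across-batch Reptile meta-update:
    #   theta <- theta_A0 + gamma * (theta - theta_A0)
    _interpolate(model, theta_A0, gamma)

    buffer.add((x, y))  # reservoir insertion of the new example

def _interpolate(model, old_state, rate):
    """Move the model's weights to old_state + rate * (current - old_state)."""
    new_state = model.state_dict()
    for k in new_state:
        new_state[k] = old_state[k] + rate * (new_state[k] - old_state[k])
    model.load_state_dict(new_state)
```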
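
For the datasets, this is a sketch of how MNIST Permutations-style tasks are typically generated: each task applies one fixed random pixel permutation to every image, while labels are unchanged (MNIST Rotations instead applies a fixed rotation per task). The function name and seed handling are ours.

```python
import numpy as np

def make_permutation_tasks(images, labels, num_tasks, seed=0):
    """Build MNIST Permutations-style tasks: one fixed random pixel
    permutation per task, applied to every image; labels are unchanged.

    images: float array of shape (N, 784), flattened MNIST digits.
    """
    rng = np.random.RandomState(seed)
    tasks = []
    for _ in range(num_tasks):
        perm = rng.permutation(images.shape[1])  # one permutation per task
        tasks.append((images[:, perm], labels))
    return tasks
```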
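
Finally, a sketch of the two-layer MLP with 100 hidden units per layer described in the Experiment Setup row, written in PyTorch. The ReLU activations are an assumption from common practice; the paper's Appendix J has the exact configuration.

```python
import torch.nn as nn

# Two hidden layers of 100 units each, 784-dim flattened MNIST input,
# 10 output classes. ReLU is assumed; see the paper's Appendix J for
# the exact configuration.
mlp = nn.Sequential(
    nn.Linear(784, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, 10),
)
```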