Selective Experience Replay for Lifelong Learning

Authors: David Isele, Akansel Cosgun

AAAI 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We explore four strategies for selecting which experiences will be stored: favoring surprise, favoring reward, matching the global training distribution, and maximizing coverage of the state space. We show that distribution matching successfully prevents catastrophic forgetting, and is consistently the best approach on all domains tested." |
| Researcher Affiliation | Collaboration | David Isele (The University of Pennsylvania and Honda Research Institute, isele@seas.upenn.edu); Akansel Cosgun (Honda Research Institute, akansel.cosgun@gmail.com) |
| Pseudocode | No | The paper describes its algorithms in prose and mathematical formulas but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an unambiguous statement of code release or a direct link to a source code repository for the described methodology. |
| Open Datasets | Yes | "We evaluate selective experience replay on the problem of autonomously handling unsigned intersections. The Sumo simulator (Krajzewicz et al. 2012) allows for the creation of a variety of different driving tasks... In the lifelong learning variant of MNIST, the agent is exposed to only two digits. At the end of training on five tasks the agent is expected to be able to correctly classify all 10 digits." |
| Dataset Splits | No | The paper mentions training on a certain number of experiences (e.g., "10,000 training experiences on each task") and evaluating performance via trials (e.g., "test at each time step involves 100 trials"), but does not specify explicit train/validation/test dataset splits with percentages or counts. |
| Hardware Specification | No | The paper does not provide specific details such as GPU/CPU models, memory amounts, or types of computing resources used for the experiments. |
| Software Dependencies | No | The paper mentions the Sumo simulator but does not provide its version number or any other specific software dependencies with their versions. |
| Experiment Setup | No | "We describe the details of our network and training parameters in the appendix." |
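Since the paper provides no pseudocode, it may help to illustrate the distribution-matching strategy quoted above. A standard way to keep a fixed-size buffer whose contents approximate the global training distribution is reservoir sampling; the sketch below is an assumption about how such a strategy could be implemented, not the authors' code, and the class name `ReservoirReplayBuffer` is hypothetical.

```python
import random


class ReservoirReplayBuffer:
    """Fixed-capacity experience buffer using reservoir sampling.

    Every experience ever seen has an equal probability of being
    retained, so the buffer's contents approximate the global
    training distribution across all tasks seen so far.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.buffer = []
        self.n_seen = 0  # total experiences observed so far
        self.rng = random.Random(seed)

    def add(self, experience):
        self.n_seen += 1
        if len(self.buffer) < self.capacity:
            # Buffer not yet full: always keep the experience.
            self.buffer.append(experience)
        else:
            # Keep the new experience with probability capacity / n_seen,
            # replacing a uniformly chosen existing entry.
            j = self.rng.randrange(self.n_seen)
            if j < self.capacity:
                self.buffer[j] = experience

    def sample(self, batch_size):
        # Uniform minibatch from the retained experiences.
        return self.rng.sample(self.buffer, min(batch_size, len(self.buffer)))


# Usage: stream 1000 experiences through a 100-slot buffer.
buf = ReservoirReplayBuffer(capacity=100, seed=0)
for step in range(1000):
    buf.add({"state": step, "reward": 0.0})
```

Because replacement is uniform over everything seen, experiences from early tasks are never systematically evicted, which is the property that counters catastrophic forgetting in the distribution-matching strategy.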