Selective Experience Replay for Lifelong Learning
Authors: David Isele, Akansel Cosgun
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore four strategies for selecting which experiences will be stored: favoring surprise, favoring reward, matching the global training distribution, and maximizing coverage of the state space. We show that distribution matching successfully prevents catastrophic forgetting, and is consistently the best approach on all domains tested. |
| Researcher Affiliation | Collaboration | David Isele, The University of Pennsylvania and Honda Research Institute (isele@seas.upenn.edu); Akansel Cosgun, Honda Research Institute (akansel.cosgun@gmail.com) |
| Pseudocode | No | The paper describes algorithms in prose and uses mathematical formulas but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an unambiguous statement of code release or a direct link to a source code repository for the described methodology. |
| Open Datasets | Yes | We evaluate selective experience replay on the problem of autonomously handling unsigned intersections. The Sumo simulator (Krajzewicz et al. 2012) allows for the creation of a variety of different driving tasks... In the lifelong learning variant of MNIST, the agent is exposed to only two digits. At the end of training on five tasks the agent is expected to be able to correctly classify all 10 digits. |
| Dataset Splits | No | The paper mentions training on a certain number of experiences (e.g., '10,000 training experiences on each task') and evaluating performance via trials (e.g., 'test at each time step involves 100 trials'), but does not specify explicit train/validation/test dataset splits with percentages or counts for the datasets themselves. |
| Hardware Specification | No | The paper does not provide specific details such as GPU/CPU models, memory amounts, or types of computing resources used for the experiments. |
| Software Dependencies | No | The paper mentions the 'Sumo simulator' but does not provide its version number or any other specific software dependencies with their versions. |
| Experiment Setup | No | We describe the details of our network and training parameters in the appendix. |
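The "distribution matching" strategy noted above selects which experiences to retain so that the buffer approximates the global distribution of everything seen across tasks. Since the paper provides no pseudocode or released code, the following is only a minimal sketch of one standard way to achieve this, reservoir sampling; the buffer class, method names, and the choice of reservoir sampling itself are assumptions, not the authors' implementation.

```python
import random


class ReservoirBuffer:
    """Fixed-size replay buffer whose contents approximate a uniform
    sample over the entire stream of experiences seen so far, i.e. the
    global training distribution across all tasks.

    NOTE: this is an illustrative sketch of distribution matching via
    reservoir sampling, not the paper's verified implementation.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []
        self.n_seen = 0  # total experiences observed across all tasks

    def add(self, experience):
        self.n_seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(experience)
        else:
            # Each incoming experience replaces a random slot with
            # probability capacity / n_seen, so every experience in the
            # stream ends up stored with equal probability.
            j = random.randrange(self.n_seen)
            if j < self.capacity:
                self.buffer[j] = experience

    def sample(self, batch_size):
        """Draw a training minibatch from the retained experiences."""
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```

Under this scheme, experiences from early tasks are never systematically evicted, which is the property that prevents catastrophic forgetting in the distribution-matching condition the paper reports.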