Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Selective Experience Replay for Lifelong Learning
Authors: David Isele, Akansel Cosgun
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore four strategies for selecting which experiences will be stored: favoring surprise, favoring reward, matching the global training distribution, and maximizing coverage of the state space. We show that distribution matching successfully prevents catastrophic forgetting, and is consistently the best approach on all domains tested. |
| Researcher Affiliation | Collaboration | David Isele, The University of Pennsylvania and Honda Research Institute; Akansel Cosgun, Honda Research Institute |
| Pseudocode | No | The paper describes algorithms in prose and uses mathematical formulas but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an unambiguous statement of code release or a direct link to a source code repository for the described methodology. |
| Open Datasets | Yes | We evaluate selective experience replay on the problem of autonomously handling unsigned intersections. The Sumo simulator (Krajzewicz et al. 2012) allows for the creation of a variety of different driving tasks... In the lifelong learning variant of MNIST, the agent is exposed to only two digits. At the end of training on five tasks the agent is expected to be able to correctly classify all 10 digits. |
| Dataset Splits | No | The paper mentions a number of training experiences (e.g., '10,000 training experiences on each task') and evaluates performance via trials (e.g., 'test at each time step involves 100 trials'), but does not specify explicit train/validation/test splits with percentages or counts. |
| Hardware Specification | No | The paper does not provide specific details such as GPU/CPU models, memory amounts, or types of computing resources used for the experiments. |
| Software Dependencies | No | The paper mentions the 'Sumo simulator' but does not provide its version number or any other specific software dependencies with their versions. |
| Experiment Setup | No | We describe the details of our network and training parameters in the appendix. |
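The distribution-matching strategy highlighted above (keeping the replay buffer representative of the global training distribution) is commonly realized with reservoir sampling. The following is a minimal illustrative sketch, not the paper's implementation; the class and method names are hypothetical.

```python
import random


class ReservoirReplayBuffer:
    """Fixed-capacity replay buffer whose contents approximate the
    global distribution of all experiences seen so far, via reservoir
    sampling (one plausible realization of distribution matching)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.n_seen = 0  # total experiences observed so far
        self.rng = random.Random(seed)

    def add(self, experience):
        """Insert an experience; each of the n_seen experiences ends up
        retained with equal probability capacity / n_seen."""
        self.n_seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(experience)
        else:
            # Replace a random slot with probability capacity / n_seen.
            j = self.rng.randrange(self.n_seen)
            if j < self.capacity:
                self.buffer[j] = experience

    def sample(self, batch_size):
        """Draw a minibatch uniformly from the stored experiences."""
        return self.rng.sample(self.buffer, min(batch_size, len(self.buffer)))
```

Because every experience is kept with the same probability regardless of when it arrived, the buffer remains representative of all tasks seen so far, which is the property the paper credits with preventing catastrophic forgetting.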