State Abstractions for Lifelong Reinforcement Learning

Authors: David Abel, Dilip Arumugam, Lucas Lehnert, Michael Littman

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "4. Experiments: We conduct two sets of simple experiments with the goal of illuminating how state abstractions of various forms impact learning and decision making."
Researcher Affiliation | Academia | "Department of Computer Science, Brown University. Correspondence to: David Abel <david_abel@brown.edu>."
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "We make all our code publicly available for reproduction of results and extension." https://github.com/david-abel/rl_abstraction
Open Datasets | No | The paper uses custom-built grid-world environments ("Color Room," "Four Room") and does not provide concrete access information (a specific link, DOI, repository, or formal citation) for a publicly available dataset.
Dataset Splits | No | The paper describes episodic learning over 100 episodes or 250/500 steps and reports averages with 95% confidence intervals, but it does not specify explicit train/validation/test splits with percentages, sample counts, or citations to predefined splits.
Hardware Specification | No | The paper does not report the hardware used to run its experiments, such as GPU/CPU models, processor types, or memory.
Software Dependencies | No | The paper mentions standard algorithms such as Q-Learning and Delayed Q-Learning, but it does not list the software dependencies or library versions (e.g., Python, PyTorch, TensorFlow) needed for replication.
Experiment Setup | Yes | Each agent is given 250 steps to interact with the MDP. The abstraction parameter is set to ε = 0.01. For PAC abstractions, δ = 0.1 and ε = 0.1. One experiment involves 100 episodes with 500 steps per episode.
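
For readers attempting a reproduction, the sketch below collects the hyperparameters quoted in the Experiment Setup row into one configuration and illustrates one common way an ε threshold is used to form an approximate state abstraction: aggregating states whose Q-values agree within ε on every action. This is a minimal sketch under those assumptions; the function and variable names are hypothetical illustrations and are not the API of the authors' rl_abstraction repository.

```python
# Hypothetical sketch: names are illustrative, not the API of
# https://github.com/david-abel/rl_abstraction.

# Hyperparameters reported in the paper's experiment setup.
CONFIG = {
    "steps_per_mdp": 250,        # steps each agent interacts with the MDP
    "abstraction_epsilon": 0.01, # ε, the abstraction parameter
    "pac_delta": 0.1,            # δ for PAC abstractions
    "pac_epsilon": 0.1,          # ε for PAC abstractions
    "episodes": 100,             # one experiment: 100 episodes ...
    "steps_per_episode": 500,    # ... of 500 steps each
}

def epsilon_q_abstraction(q_values, actions, epsilon):
    """Group states whose Q-values agree within epsilon on every action.

    q_values: dict mapping (state, action) -> Q(s, a).
    Returns a dict mapping state -> abstract-state id. This is a generic
    greedy aggregation for illustration only; the paper's construction
    may differ in detail.
    """
    clusters = []          # each cluster stores one representative state
    state_to_cluster = {}
    states = sorted({s for (s, _) in q_values})
    for s in states:
        for idx, rep in enumerate(clusters):
            if all(abs(q_values[(s, a)] - q_values[(rep, a)]) <= epsilon
                   for a in actions):
                state_to_cluster[s] = idx
                break
        else:
            clusters.append(s)
            state_to_cluster[s] = len(clusters) - 1
    return state_to_cluster

# Tiny usage example with made-up Q-values on a 3-state, 2-action problem.
if __name__ == "__main__":
    actions = ["left", "right"]
    q = {
        ("s0", "left"): 0.500, ("s0", "right"): 0.300,
        ("s1", "left"): 0.505, ("s1", "right"): 0.302,  # within ε of s0
        ("s2", "left"): 0.900, ("s2", "right"): 0.100,  # far from s0/s1
    }
    print(epsilon_q_abstraction(q, actions, CONFIG["abstraction_epsilon"]))
    # -> {'s0': 0, 's1': 0, 's2': 1}
```

In this toy run, s0 and s1 collapse into one abstract state because their Q-values differ by less than ε = 0.01 on both actions, while s2 remains separate.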