Count-Based Exploration in Feature Space for Reinforcement Learning

Authors: Jarryd Martin, Suraj Narayanan S., Tom Everitt, Marcus Hutter

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation demonstrates that this simple approach achieves near state-of-the-art performance on high-dimensional RL benchmarks.
Researcher Affiliation | Academia | Jarryd Martin, Suraj Narayanan S., Tom Everitt, Marcus Hutter; Research School of Computer Science, Australian National University, Canberra; jarrydmartinx@gmail.com, surajx@gmail.com, tom.everitt@anu.edu.au, marcus.hutter@anu.edu.au
Pseudocode | Yes | Algorithm 1: Reinforcement Learning with LFA and φ-EB.
Open Source Code | No | The paper does not state that its source code is publicly available, nor does it link to a code repository.
Open Datasets | Yes | We evaluate our algorithm on five games from the Arcade Learning Environment (ALE), which has recently become a standard high-dimensional benchmark for RL [Bellemare et al., 2013].
Dataset Splits | No | The paper mentions training and evaluation (testing) frames and episodes but does not specify a separate validation split.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU models, CPU types, or cloud resources) used for running the experiments.
Software Dependencies | No | The paper mentions using Sarsa(λ) and the Blob-PROST feature set but does not provide version numbers for any software, libraries, or frameworks.
Experiment Setup | Yes | The β coefficient in the φ-exploration bonus was set to 0.05 for all games, after a coarse parameter search. This search was performed once, across a range of ALE games, and a value was chosen for which the agent achieved good scores in most games. The parameters for the Sarsa(λ) algorithm are set to the same values as in [Liang et al., 2016].
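
For context on the φ-exploration bonus and the β = 0.05 setting referenced above, the sketch below is an illustrative reconstruction rather than the authors' released code. It assumes binary features (as with Blob-PROST), a factorised feature-visit density built from independent per-feature Krichevsky-Trofimov estimators, the pseudo-count formula of Bellemare et al. (2016), and a bonus of the form β / sqrt(N̂ + 0.01); the class and method names are hypothetical.

```python
import math


class PhiExplorationBonus:
    """Minimal sketch of a phi-EB style count-based exploration bonus.

    Assumed details (not verbatim from the paper): binary features, a
    factorised visit-density of independent per-feature KT estimators,
    the pseudo-count of Bellemare et al. (2016), and a bonus of
    beta / sqrt(N_hat + 0.01).
    """

    def __init__(self, num_features: int, beta: float = 0.05):
        self.beta = beta
        self.t = 0                        # number of feature vectors observed
        self.ones = [0] * num_features    # per-feature counts of phi_i = 1

    def _log_density(self, phi):
        """log prod_i rho_i(phi_i) under the current per-feature KT counts."""
        log_rho = 0.0
        for count, x in zip(self.ones, phi):
            p_one = (count + 0.5) / (self.t + 1.0)
            log_rho += math.log(p_one if x else 1.0 - p_one)
        return log_rho

    def bonus(self, phi):
        """Update the density model on phi and return the exploration bonus."""
        log_rho = self._log_density(phi)          # density before the update
        self.t += 1
        for i, x in enumerate(phi):
            self.ones[i] += x
        log_rho_prime = self._log_density(phi)    # "recoding" density after
        # Prediction gain PG = log(rho'/rho). The pseudo-count
        # N_hat = rho * (1 - rho') / (rho' - rho) is approximately
        # 1 / (exp(PG) - 1) when rho' is small, which avoids underflow
        # for long feature vectors.
        pg = max(log_rho_prime - log_rho, 1e-12)
        n_hat = 1.0 / math.expm1(pg)
        return self.beta / math.sqrt(n_hat + 0.01)


# Toy usage: augment an environment reward with the bonus.
bonus_model = PhiExplorationBonus(num_features=4, beta=0.05)
phi = [1, 0, 1, 0]
augmented_reward = 0.0 + bonus_model.bonus(phi)
```

In an agent of the kind the paper describes (Sarsa(λ) with linear function approximation), such a bonus would be added to the environment reward before the temporal-difference update, so rarely visited regions of feature space receive larger intrinsic rewards.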