A Boolean Task Algebra for Reinforcement Learning

Authors: Geraud Nangue Tasse, Steven James, Benjamin Rosman

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We verify our approach in two domains including a high-dimensional video game environment requiring function approximation where an agent first learns a set of base skills, and then composes them to solve a super-exponential number of new tasks. We illustrate our approach in the Four Rooms domain (Sutton et al., 1999), where an agent must navigate a grid world to a particular location. We then demonstrate composition in a high-dimensional video game environment, where an agent first learns to collect different objects, and then composes these abilities to solve complex tasks immediately. Our results show that, even when function approximation is required, an agent can leverage its existing skills to solve new tasks without further learning.
Researcher Affiliation	Academia	Geraud Nangue Tasse, Steven James, Benjamin Rosman School of Computer Science and Applied Mathematics University of the Witwatersrand Johannesburg, South Africa
Pseudocode	Yes	The full pseudocode is listed in the supplementary material.
Open Source Code	No	The paper does not provide an explicit statement or link to the source code for the methodology described in this paper.
Open Datasets	No	The paper describes the environments used, namely the "Four Rooms domain (Sutton et al., 1999)" and the same video game environment as Van Niekerk et al. (2019), which are conceptual or taken from prior work. However, it does not provide concrete access information (link, DOI, repository, or formal citation with authors/year for a dataset) for any specific dataset used for training in this paper.
Dataset Splits	No	The paper does not provide specific dataset split information (percentages, sample counts, or citations to predefined splits) needed to reproduce the data partitioning for training, validation, and testing.
Hardware Specification	No	The paper does not provide specific hardware details (like exact GPU/CPU models or processor types) used for running its experiments.
Software Dependencies	No	The paper mentions modifying Q-learning and deep Q-learning but does not provide specific software names with version numbers (e.g., library versions like PyTorch 1.9, TensorFlow 2.x).
Experiment Setup	No	Section 5 of the paper states that "The hyperparameters and network architecture are listed in the supplementary material", but the main text does not provide these specific experimental setup details (concrete hyperparameter values, training configurations, or system-level settings).