Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Boolean Task Algebra for Reinforcement Learning
Authors: Geraud Nangue Tasse, Steven James, Benjamin Rosman
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our approach in two domains including a high-dimensional video game environment requiring function approximation where an agent first learns a set of base skills, and then composes them to solve a super-exponential number of new tasks. We illustrate our approach in the Four Rooms domain (Sutton et al., 1999), where an agent must navigate a grid world to a particular location. We then demonstrate composition in a high-dimensional video game environment, where an agent first learns to collect different objects, and then composes these abilities to solve complex tasks immediately. Our results show that, even when function approximation is required, an agent can leverage its existing skills to solve new tasks without further learning. |
| Researcher Affiliation | Academia | Geraud Nangue Tasse, Steven James, Benjamin Rosman School of Computer Science and Applied Mathematics University of the Witwatersrand Johannesburg, South Africa |
| Pseudocode | Yes | The full pseudocode is listed in the supplementary material. |
| Open Source Code | No | The paper does not provide an explicit statement or link to the source code for the methodology described in this paper. |
| Open Datasets | No | The paper describes the environments used, such as the "Four Rooms domain (Sutton et al., 1999)" and a "video game environment as Van Niekerk et al. (2019)", which are conceptual or from prior work. However, it does not provide concrete access information (link, DOI, repository, or formal citation with authors/year for a dataset) for any specific dataset used for training in this paper. |
| Dataset Splits | No | The paper does not provide specific dataset split information (percentages, sample counts, or citations to predefined splits) needed to reproduce the data partitioning for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (like exact GPU/CPU models or processor types) used for running its experiments. |
| Software Dependencies | No | The paper mentions modifying Q-learning and deep Q-learning but does not provide specific software names with version numbers (e.g., library versions like PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | No | While the paper states that "The hyperparameters and network architecture are listed in the supplementary material" in Section 5, it does not provide these specific experimental setup details (concrete hyperparameter values, training configurations, or system-level settings) in the main text of the paper. |
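The skill composition quoted in the Research Type row can be illustrated with a small sketch. The paper composes learned value functions with Boolean operators (disjunction as an elementwise max, conjunction as an elementwise min, and negation relative to upper and lower value bounds). The snippet below is a minimal illustration of that idea on random tabular Q-values, not the authors' implementation: the task names, array shapes, and the bounds `q_max`/`q_min` are all assumptions made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3

# Stand-ins for Q-functions learned on two base tasks
# (e.g. "collect blue objects", "collect square objects").
q_blue = rng.uniform(0.0, 1.0, (n_states, n_actions))
q_square = rng.uniform(0.0, 1.0, (n_states, n_actions))

# Assumed value bounds: Q-functions of the maximally and minimally
# rewarding tasks, needed for negation.
q_max = np.full((n_states, n_actions), 1.0)
q_min = np.zeros((n_states, n_actions))

def q_or(q1, q2):
    """Disjunction: act well if EITHER task's goal is acceptable."""
    return np.maximum(q1, q2)

def q_and(q1, q2):
    """Conjunction: act well only where BOTH tasks agree."""
    return np.minimum(q1, q2)

def q_not(q):
    """Negation: reflect Q between the upper and lower bounds."""
    return (q_max + q_min) - q

# A new task ("blue XOR square") composed zero-shot from base skills.
q_xor = q_or(q_and(q_blue, q_not(q_square)),
             q_and(q_not(q_blue), q_square))

# Greedy policy for the composed task, with no further learning.
greedy_actions = q_xor.argmax(axis=1)
print(greedy_actions.shape)
```

Under this scheme, once the base Q-functions are learned, any task expressible as a Boolean formula over the base goals gets a policy for free, which is the "super-exponential number of new tasks" claim in the excerpt above.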