Successor Features for Transfer in Reinforcement Learning

Authors: André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, Hado P. van Hasselt, David Silver

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | "...present experiments that show that it successfully promotes transfer in practice, significantly outperforming alternative methods in a sequence of navigation tasks and in the control of a simulated robotic arm."
Researcher Affiliation | Industry | {andrebarreto,wdabney,munos,jjhunt,schaul,hado,davidsilver}@google.com
Pseudocode | Yes | "The details of QL, PRQL, and SFQL, including their pseudo-codes, are given in Appendix B."
Open Source Code | No | The paper does not provide an explicit statement about releasing its source code or a link to a code repository for its implementation.
Open Datasets | No | The paper describes custom-designed environments and tasks (the four-room domain and the reacher domain) and how task rewards were sampled, but does not provide access information for a publicly available dataset.
Dataset Splits | No | The paper mentions "training tasks" and "test tasks" but does not explicitly describe a validation split.
Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU or GPU models, memory) used for running the experiments or training.
Software Dependencies | No | The paper mentions software components such as the MuJoCo physics engine and DQN but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | No | The paper describes general aspects of the experimental setup and learning protocols, but does not provide specific numerical hyperparameters (e.g., learning rates, batch sizes, epochs) for the training process in the main text, deferring some details to Appendix B.
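For readers weighing the Pseudocode row, the sketch below illustrates the two ideas that SFQL builds on: the successor-feature value decomposition Q_i(s, a) = psi_i(s, a)^T w and the generalized policy improvement (GPI) step used to transfer to a new task. This is a minimal, hypothetical tabular sketch, not the authors' Appendix B pseudocode; the function names, array shapes, and default step sizes are illustrative assumptions.

```python
import numpy as np

def gpi_action(psi_list, w, state):
    """Pick an action by generalized policy improvement (GPI).

    psi_list : list of arrays, each of shape [num_states, num_actions, num_features],
               holding the successor features psi_i of a previously learned policy.
    w        : array of shape [num_features], reward weights of the current task.
    state    : integer index of the current state.
    """
    # Evaluate each stored policy on the current task: Q_i(s, a) = psi_i(s, a) . w
    q_per_policy = np.stack([psi[state] @ w for psi in psi_list])  # [num_policies, num_actions]
    # GPI acts greedily with respect to the maximum over all evaluated policies.
    return int(np.argmax(q_per_policy.max(axis=0)))

def td_update_psi(psi, phi_t, s, a, s_next, a_next, alpha=0.1, gamma=0.95):
    """One tabular TD step on successor features:
    psi(s, a) <- psi(s, a) + alpha * [phi_t + gamma * psi(s', a') - psi(s, a)],
    where phi_t is the feature vector observed for the transition (s, a, s')
    and a' is the action the policy being evaluated would take in s'.
    """
    psi[s, a] += alpha * (phi_t + gamma * psi[s_next, a_next] - psi[s, a])
    return psi
```

Usage sketch: given SF tables psi_1 and psi_2 learned on earlier tasks and a reward-weight vector w fit to a new task, `gpi_action([psi_1, psi_2], w, s)` returns the GPI-greedy action in state `s`, while `td_update_psi` keeps each table consistent with observed transition features.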