Successor Features for Transfer in Reinforcement Learning
Authors: André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, Hado P. van Hasselt, David Silver
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The authors "present experiments that show that it successfully promotes transfer in practice, significantly outperforming alternative methods in a sequence of navigation tasks and in the control of a simulated robotic arm." |
| Researcher Affiliation | Industry | All authors list google.com email addresses: {andrebarreto,wdabney,munos,jjhunt,schaul,hado,davidsilver}@google.com |
| Pseudocode | Yes | "The details of QL, PRQL, and SFQL, including their pseudo-codes, are given in Appendix B." |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its source code or a link to a code repository for its implementation. |
| Open Datasets | No | The paper describes custom-designed environments and tasks ('four-room domain', 'reacher domain') and how task rewards were sampled, but does not provide access information for a publicly available dataset. |
| Dataset Splits | No | The paper mentions 'training tasks' and 'test tasks' but does not explicitly describe a validation dataset split. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments or training. |
| Software Dependencies | No | The paper mentions software components such as the 'MuJoCo physics engine' and 'DQN' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | No | The paper describes the experimental setup and learning protocols in general terms, but the main text does not give specific numerical hyperparameters (e.g., learning rates, batch sizes, number of epochs), deferring those details to Appendix B. |
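For readers unfamiliar with the method the table refers to: the paper's transfer mechanism rests on decomposing the action-value function into successor features, Q^π(s, a) = ψ^π(s, a)ᵀw, where ψ summarizes the environment dynamics under policy π and w encodes a task's reward. The Python sketch below illustrates this idea in a toy tabular setting; the dimensions, the feature array `phi`, the helper `td_update_psi`, and the task weights `w_task` are illustrative assumptions, not code or values from the paper.

```python
# A minimal sketch (not the authors' code) of the successor-feature (SF)
# decomposition Q^pi(s, a) = psi^pi(s, a) . w described in the paper.
# All sizes and values below are invented for illustration.
import numpy as np

n_states, n_actions, d = 4, 2, 3   # assumed toy dimensions
gamma = 0.9                        # discount factor

rng = np.random.default_rng(0)
phi = rng.random((n_states, n_actions, d))   # reward features phi(s, a)
psi = np.zeros((n_states, n_actions, d))     # successor features psi(s, a)

def td_update_psi(s, a, s_next, a_next, alpha=0.1):
    """One TD(0) update of the successor features under the current policy:
    psi(s,a) <- psi(s,a) + alpha * [phi(s,a) + gamma*psi(s',a') - psi(s,a)]."""
    target = phi[s, a] + gamma * psi[s_next, a_next]
    psi[s, a] += alpha * (target - psi[s, a])

# Transfer: once psi is learned, the Q-values of any task whose reward is
# r(s,a) = phi(s,a) . w follow without further learning:
w_task = np.array([1.0, -0.5, 0.2])          # assumed task reward weights
q_values = psi @ w_task                      # Q(s, a) = psi(s, a) . w
```

The last two lines are the point of the decomposition and the basis of SFQL: once ψ has been learned, Q-values for a new task with reward weights w are obtained by a single dot product, which is what enables the transfer the paper evaluates.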