reproducibilityindex.ai

Risk-Aware Transfer in Reinforcement Learning using Successor Features

Authors: Michael Gimelfarb, André Barreto, Scott Sanner, Chi-Guhn Lee

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on a discrete navigation domain and control of a simulated robotic arm demonstrate the ability of RaSFs to outperform alternative methods including SFs, when taking the risk of the learned policies into account. Empirical evaluations on discrete navigation and continuous robot control domains (Section 4) demonstrate the ability of RaSFs to better manage the trade-off between return and risk and avoid catastrophic outcomes, while providing excellent generalization on novel tasks in the same domain.
Researcher Affiliation	Collaboration	Michael Gimelfarb, University of Toronto, mike.gimelfarb@mail.utoronto.ca; André Barreto, DeepMind, andrebarreto@google.com; Scott Sanner, University of Toronto, ssanner@mie.utoronto.ca; Chi-Guhn Lee, University of Toronto, cglee@mie.utoronto.ca
Pseudocode	No	The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm'.
Open Source Code	No	The paper does not explicitly state that the source code for their methodology is released, nor does it provide a link to a code repository.
Open Datasets	Yes	To evaluate the performance of RaSF, we revisit the benchmark domains in Barreto et al. [2], which have been slightly modified for learning and evaluating risk-aware behaviors. The second domain consists of a set of tasks based on the MuJoCo physics engine [42] that involve the maneuver of a robotic arm toward a fixed target location.
Dataset Splits	No	The paper mentions 'training' and 'test' tasks for the Reacher domain but does not provide specific details on validation splits or percentages.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments (e.g., CPU, GPU models, or memory specifications).
Software Dependencies	No	The paper mentions the 'MuJoCo physics engine' and 'C51' architecture but does not specify version numbers for these or any other software dependencies, such as programming languages or libraries.
Experiment Setup	No	We defer all experimental details to Appendix C.