MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

Authors: Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr H Pong, Aurick Zhou, Justin Yu, Sergey Levine

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "6. Experimental Evaluation: In our experimental evaluation we aim to answer the following questions: (1) Can MURAL make effective use of successful outcome examples to solve challenging exploration tasks? (2) Does MURAL scale to dynamically complex tasks? (3) What are the impacts of different design decisions on the effectiveness of MURAL? Further details, videos, and code can be found at https://sites.google.com/view/mural-rl"
Researcher Affiliation | Academia | "Department of Electrical Engineering and Computer Sciences, UC Berkeley, Berkeley, USA. Correspondence to: Kevin Li <kevintli@berkeley.edu>, Abhishek Gupta <abhigupta@berkeley.edu>."
Pseudocode | Yes | "Algorithm 1 RL with CNML-Based Success Classifiers" and "Algorithm 2 MURAL: Meta-learning Uncertainty-aware Rewards for Automated Outcome-driven RL" (a hedged sketch of this training loop is given after the table below)
Open Source Code | Yes | "Further details, videos, and code can be found at https://sites.google.com/view/mural-rl"
Open Datasets | No | The paper describes environments and tasks (e.g., maze navigation, robotic manipulation with a Sawyer arm, quadruped ant locomotion) rather than specific, named public datasets with explicit access information (link, DOI, or citation with author/year). The problem is framed as outcome-driven RL, in which successful outcome examples are provided by the user and on-policy samples are collected during training, rather than drawn from a fixed pre-existing dataset.
Dataset Splits | No | The paper does not provide specific percentages, sample counts, or citations for train/validation/test splits needed to reproduce the experiments. It describes how data is sampled for the classifier during reinforcement learning, but not a fixed, reproducible dataset partition.
Hardware Specification | No | The paper does not provide specific hardware details, such as GPU/CPU models, processor types, or memory amounts, used to run its experiments.
Software Dependencies | No | The paper does not provide ancillary software details, such as library or solver names with version numbers, needed to replicate the experiments.
Experiment Setup | No | The paper states "Further details are in Appendix A.2" and "More details are included in Appendix A.4 and A.6" for the experimental setup, but the main text itself does not give specific hyperparameter values or training configurations.
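To make the pseudocode row above concrete, here is a minimal sketch (not the authors' code) of the idea behind "Algorithm 1 RL with CNML-Based Success Classifiers": the reward for a queried state is the conditional normalized maximum likelihood (CNML) probability that the state is a successful outcome, computed against the user-provided success examples and recently visited on-policy states. The paper amortizes CNML with meta-learning (Algorithm 2); this sketch instead computes it naively by refitting a small logistic classifier once per candidate label, which is only practical for toy data. All names (cnml_success_reward, toy_success_examples, and so on) are illustrative, not taken from the paper.

```python
# Hedged sketch of CNML-based success rewards on toy 2D "states".
# Assumption: success examples and on-policy states are plain feature vectors.
import numpy as np


def fit_logistic(X, y, steps=200, lr=0.5):
    """Fit a logistic regression (weights + bias) by gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b


def predict(w, b, x):
    """Probability of the positive (success) label for a single state x."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))


def cnml_success_reward(query_state, success_examples, policy_states):
    """CNML probability that `query_state` is a success.

    For each candidate label y' in {0, 1}: append (query_state, y') to the
    training set, refit the classifier, and evaluate the probability it
    assigns to y' at the query; then normalize across the two candidates.
    """
    X = np.vstack([success_examples, policy_states])
    y = np.concatenate([np.ones(len(success_examples)),
                        np.zeros(len(policy_states))])
    likelihoods = []
    for label in (0.0, 1.0):
        w, b = fit_logistic(np.vstack([X, query_state]),
                            np.concatenate([y, [label]]))
        p_pos = predict(w, b, query_state)
        likelihoods.append(p_pos if label == 1.0 else 1.0 - p_pos)
    return likelihoods[1] / (likelihoods[0] + likelihoods[1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: successes cluster near (1, 1); on-policy states near the origin.
    toy_success_examples = rng.normal(loc=[1.0, 1.0], scale=0.1, size=(20, 2))
    toy_policy_states = rng.normal(loc=[0.0, 0.0], scale=0.1, size=(20, 2))

    for s in ([0.0, 0.0], [0.5, 0.5], [1.0, 1.0]):
        r = cnml_success_reward(np.array(s), toy_success_examples, toy_policy_states)
        print(f"state {s} -> CNML success reward {r:.3f}")
```

In the full MURAL method, these per-query refits are replaced by a meta-learned classifier so that the CNML probability can be produced in a few gradient steps per query, and that probability is used directly as the reward for a standard RL algorithm.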