Meta-Reinforcement Learning of Structured Exploration Strategies
Authors: Abhishek Gupta, Russell Mendonca, YuXuan Liu, Pieter Abbeel, Sergey Levine
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on a variety of simulated tasks: locomotion with a wheeled robot, locomotion with a quadrupedal walker, and object manipulation. |
| Researcher Affiliation | Academia | Abhishek Gupta, Russell Mendonca, Yu Xuan Liu, Pieter Abbeel, Sergey Levine Department of Electrical Engineering and Computer Science University of California, Berkeley {abhigupta, pabbeel, svlevine}@eecs.berkeley.edu {russellm, yuxuanliu}@berkeley.edu |
| Pseudocode | Yes | Algorithm 1 MAESN meta-RL algorithm. A hedged structural sketch of this loop is given after the table. |
| Open Source Code | No | Videos and experimental details for all our experiments can be found at https://sites.google.com/view/meta-explore/. The linked project site hosts videos and experimental details, but the paper does not confirm that source code was released. |
| Open Datasets | No | The paper describes custom 'simulated tasks' and 'task distributions' but provides no concrete access information for them (link, DOI, specific repository, or formal citation with author/year for a public dataset). |
| Dataset Splits | No | Rewards are averaged over 100 validation tasks, which have sparse rewards as described in supplementary material (Figure 3 caption), and averaged across 30 validation tasks (Section 4.3). These are held-out validation tasks for meta-learning, not train/validation/test splits of a dataset. |
| Hardware Specification | No | All experiments were initially run on a local 2-GPU machine, and run at scale using Amazon Web Services. This does not specify hardware models or detailed specifications. |
| Software Dependencies | No | The paper mentions 'trust region policy optimization (TRPO) [24]' and other algorithms but does not name specific software packages with version numbers or list library dependencies. |
| Experiment Setup | No | Hyperparameters for each algorithm, selected via a hyperparameter sweep, are given in the supplementary materials (the sweep is also detailed in the appendix); the setup details are therefore not in the main text. |
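To make the quoted pseudocode concrete, below is a minimal structural sketch of a MAESN-style meta-training loop: a policy conditioned on a per-episode latent z, whose per-task variational parameters are adapted in an inner loop while the policy weights are meta-trained for post-adaptation return. Everything here is an illustrative assumption rather than the authors' implementation: the dimensions, reward, and task distribution are toy stand-ins, a finite-difference estimator replaces the paper's policy-gradient (TRPO) updates, only the latent mean is adapted, and the KL regularization to the latent prior N(0, I) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS, ACT, LAT = 3, 1, 2            # toy observation/action/latent sizes
INNER_LR, META_LR = 0.5, 0.05      # inner (adaptation) and meta step sizes
EPS, N_EPISODES = 0.1, 32

def episode_return(theta, mu, log_sigma, task_bias):
    # Structured noise: one latent z per episode, held fixed across steps,
    # so exploration is temporally coherent rather than per-timestep.
    z = mu + np.exp(log_sigma) * rng.standard_normal(LAT)
    ret = 0.0
    for _ in range(5):
        obs = rng.standard_normal(OBS)
        a = np.concatenate([obs, z]) @ theta        # theta: (OBS+LAT, ACT)
        ret -= float(np.sum((a - task_bias) ** 2))  # toy reward: match bias
    return ret

def avg_return(theta, mu, log_sigma, task_bias):
    return float(np.mean([episode_return(theta, mu, log_sigma, task_bias)
                          for _ in range(N_EPISODES)]))

def fd_grad(f, x):
    # Central finite differences stand in for the paper's policy gradient;
    # estimates are noisy because the objective is stochastic.
    g = np.zeros_like(x)
    flat, gflat = x.ravel(), g.ravel()
    for i in range(flat.size):
        old = flat[i]
        flat[i] = old + EPS; hi = f(x)
        flat[i] = old - EPS; lo = f(x)
        flat[i] = old
        gflat[i] = (hi - lo) / (2 * EPS)
    return g

theta = 0.1 * rng.standard_normal((OBS + LAT, ACT))
tasks = [np.array([b]) for b in (-1.0, 0.0, 1.0)]   # toy task distribution

for step in range(50):
    meta_grad = np.zeros_like(theta)
    for bias in tasks:
        # Inner loop: adapt only the per-task latent mean from the prior
        # N(0, I) (the paper also adapts sigma and adds a KL penalty).
        mu, log_sigma = np.zeros(LAT), np.zeros(LAT)
        mu += INNER_LR * fd_grad(
            lambda m: avg_return(theta, m, log_sigma, bias), mu)
        # Outer loop: accumulate a meta-gradient on theta for the
        # post-adaptation return, in the spirit of MAML/MAESN.
        meta_grad += fd_grad(
            lambda t: avg_return(t, mu, log_sigma, bias), theta)
    theta += META_LR * meta_grad / len(tasks)

print("meta-trained theta norm:", round(float(np.linalg.norm(theta)), 3))
```

The structural point the sketch preserves is that exploration noise enters through a latent sampled once per episode, so inner-loop adaptation reshapes a coherent exploration strategy rather than per-timestep action noise.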