reproducibilityindex.ai

Trust Region Policy Optimization

Authors: John Schulman, Sergey Levine, Pieter Abbeel, Michael Jordan, Philipp Moritz

ICML 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments demonstrate its robust performance on a wide variety of tasks: learning simulated robotic swimming, hopping, and walking gaits; and playing Atari games using images of the screen as input.
Researcher Affiliation	Academia	John Schulman JOSCHU@EECS.BERKELEY.EDU Sergey Levine SLEVINE@EECS.BERKELEY.EDU Philipp Moritz PCMORITZ@EECS.BERKELEY.EDU Michael Jordan JORDAN@CS.BERKELEY.EDU Pieter Abbeel PABBEEL@CS.BERKELEY.EDU University of California, Berkeley, Department of Electrical Engineering and Computer Sciences
Pseudocode	Yes	Algorithm 1 Approximate policy iteration algorithm guaranteeing non-increasing expected cost
Open Source Code	No	The paper does not provide an explicit statement or link to the open-source code for the methodology described.
Open Datasets	Yes	We conducted the robotic locomotion experiments using the MuJoCo simulator (Todorov et al., 2012). We tested our algorithms on the same seven games reported on in (Mnih et al., 2013) and (Guo et al., 2014).
Dataset Splits	No	The paper does not provide explicit training/validation/test dataset splits, as it focuses on reinforcement learning in simulated environments (MuJoCo) and game environments (Atari) where data is generated dynamically rather than being a static, pre-split dataset.
Hardware Specification	No	The paper mentions a '16-core computer' but does not provide specific hardware details such as exact GPU/CPU models or memory amounts.
Software Dependencies	No	The paper mentions the MuJoCo simulator, but does not provide specific version numbers for it or any other key software dependencies.
Experiment Setup	Yes	We used δ = 0.01 for all experiments. See Table 2 in the Appendix for more details on the experimental setup and parameters used. The parameters used in the experiments are provided in Appendix E.