reproducibilityindex.ai

Dual Policy Iteration

Authors: Wen Sun, Geoffrey J. Gordon, Byron Boots, J. Bagnell

NeurIPS 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the efficacy of our approach on various continuous control Markov Decision Processes. To evaluate our approach, we demonstrate our algorithm on discrete MDPs and continuous control tasks. We tested our approach on several MDPs: (1) a set of random discrete MDPs (Garnet problems [7]) (2) Cartpole balancing [31], (3) Helicopter Aerobatics (Hover and Funnel) [32], (4) Swimmer, Hopper and Half-Cheetah from the MuJoCo physics simulator [33].
Researcher Affiliation	Collaboration	1School of Computer Science, Carnegie Mellon University, USA 2College of Computing, Georgia Institute of Technology, USA 3Aurora Innovation, USA
Pseudocode	Yes	Algorithm 1 AGGREVATED-GPS
Open Source Code	No	The paper does not provide concrete access to its own source code. It mentions 'Software available from rll.berkeley.edu/gps.' but this refers to a related work's implementation, not the authors' code for this paper.
Open Datasets	Yes	We tested our approach on several MDPs: (1) a set of random discrete MDPs (Garnet problems [7]) (2) Cartpole balancing [31], (3) Helicopter Aerobatics (Hover and Funnel) [32], (4) Swimmer, Hopper and Half-Cheetah from the MuJoCo physics simulator [33].
Dataset Splits	No	The paper mentions a training split for robust policy optimization ('We use 7 of the environments for training and the remaining three for testing.'), but does not specify a separate validation set for model tuning or general performance assessment across all experiments.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory specifications) used for running its experiments.
Software Dependencies	No	The paper mentions using the 'MuJoCo physics simulator [33]' but does not provide specific version numbers for this or any other software, libraries, or programming languages used in their implementation.
Experiment Setup	Yes	The setup is detailed in Appendix B.4. The setup and the conservative update implementation is detailed in Appendix B.1.