Recurrent Predictive State Policy Networks
Authors: Ahmed Hefny, Zita Marinho, Wen Sun, Siddhartha Srinivasa, Geoffrey Gordon
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show the efficacy of RPSP-networks under partial observability on a set of robotic control tasks from OpenAI Gym. We empirically show that RPSP-networks perform well compared with memory-preserving networks such as GRUs, as well as finite memory models, being the overall best-performing method. |
| Researcher Affiliation | Academia | 1Machine Learning Department, Carnegie Mellon University, Pittsburgh, USA 2Robotics Institute, Carnegie Mellon University, Pittsburgh, USA 3ISR/IT, Instituto Superior Técnico, Lisbon, Portugal 4Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, USA. |
| Pseudocode | Yes | Algorithm 1 Recurrent Predictive State Policy network Optimization (RPSPO) |
| Open Source Code | Yes | https://github.com/ahefnycmu/rpsp |
| Open Datasets | Yes | We evaluate the RPSP-network's performance on a collection of reinforcement learning tasks using OpenAI Gym MuJoCo environments. |
| Dataset Splits | No | The paper discusses batch sizes and episode lengths for experiments, but does not provide specific percentages or counts for training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU model, CPU type, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'OpenAI Gym MuJoCo environments' and the 'RLLab' implementation of TRPO, but it does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For RPSP, we found that a step size of 10^-2 performs well for both VRPG and alternating optimization in all environments. The reactive policy contains one hidden layer of 16 nodes with ReLU activation. For each environment, we set the number of samples in the batch to 10000 and the maximum length of each episode to 200, 500, 1000, 1000 for Cart-Pole, Swimmer, Hopper and Walker2d respectively. |
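For concreteness, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration. The sketch below is a minimal Python illustration: the `RPSP_CONFIG` dictionary, its key names, and the plain-NumPy `reactive_policy` forward pass are assumptions made for readability, not the authors' actual implementation (which is available at the repository linked above).

```python
import numpy as np

# Hyperparameters quoted in the paper's experiment setup; the structure
# of this config dict is an illustrative assumption.
RPSP_CONFIG = {
    "step_size": 1e-2,             # works for both VRPG and alternating optimization
    "batch_samples": 10000,        # samch size per iteration, all environments
    "max_episode_length": {        # per-environment episode caps
        "Cart-Pole": 200,
        "Swimmer": 500,
        "Hopper": 1000,
        "Walker2d": 1000,
    },
    "reactive_policy_hidden": 16,  # one hidden layer with ReLU activation
}


def reactive_policy(state, W1, b1, W2, b2):
    """Sketch of the reactive policy described in the paper: a single
    16-unit ReLU hidden layer mapping the (predictive) state to action
    parameters. Weight shapes and the linear output head are assumptions."""
    h = np.maximum(0.0, state @ W1 + b1)  # ReLU hidden layer
    return h @ W2 + b2                    # linear output (e.g., action mean)
```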