Predictor-Corrector Policy Optimization

Authors: Ching-An Cheng, Xinyan Yan, Nathan Ratliff, Byron Boots

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show, in both theory and simulation, that the convergence rate of several first-order model-free algorithms can be improved by PICCOLO. ... To validate the theory, we PICCOLO multiple algorithms in simulation. The experimental results show that the PICCOLOed versions consistently surpass the base algorithm and are robust to model errors.
Researcher Affiliation | Collaboration | Ching-An Cheng (1,2), Xinyan Yan (1), Nathan Ratliff (2), Byron Boots (1,2); 1: Georgia Tech, 2: NVIDIA. Correspondence to: Ching-An Cheng <cacheng@gatech.edu>.
Pseudocode | Yes | Algorithm 1 (PICCOLO). (An illustrative predictor-corrector sketch follows the table.)
Open Source Code | Yes | The codes are available at https://github.com/gtrll/rlfamily.
Open Datasets | Yes | Robot RL tasks (Cart Pole, Hopper, Snake, and Walker3D) from OpenAI Gym (Brockman et al., 2016) with the DART physics engine (Lee et al., 2018).
Dataset Splits | No | The paper mentions using OpenAI Gym environments but does not specify exact training, validation, or test dataset splits or percentages.
Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments, such as exact GPU or CPU models.
Software Dependencies | No | The paper mentions implementing the algorithm using PyTorch and OpenAI Gym, but it does not specify concrete version numbers for these software dependencies (e.g., 'PyTorch 1.9' or 'OpenAI Gym X.Y').
Experiment Setup | Yes | We implement our algorithm using PyTorch (Paszke et al., 2017) and OpenAI Gym (Brockman et al., 2016). For the Adam optimizer, we use the default parameters in PyTorch (β1 = 0.9, β2 = 0.999, ε = 10^-8). ... ADAM learning rate 0.001. ... Each iteration of the algorithms involves collecting 2000 steps from 10 rollouts. ... For the TRPO and NATGRAD algorithms, we use the implementation from OpenAI Baselines (Dhariwal et al., 2017) with default hyperparameters. (A minimal optimizer-configuration sketch follows the table.)
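
The Pseudocode row refers to Algorithm 1 (PICCOLO) in the paper. As a rough illustration of the predictor-corrector idea, the sketch below applies a two-step update, a prediction step using a model-based gradient estimate followed by a correction step using the error between the sampled and predicted gradients, to plain gradient descent. The function names, the quadratic toy problem, and the biased gradient model are assumptions for illustration; this is not the paper's implementation.

```python
import numpy as np

def piccolo_gradient_descent(x0, sample_grad, predict_grad, lr, n_iters):
    """Minimal predictor-corrector sketch on top of vanilla gradient descent.

    sample_grad(x)  -- first-order feedback measured at x (e.g., a sampled policy gradient)
    predict_grad(x) -- model-based guess of that feedback at x
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        # Prediction step: move using the model's gradient estimate.
        g_hat = predict_grad(x)
        x_hat = x - lr * g_hat
        # Correction step: measure the true feedback at the predicted point
        # and move by the prediction error.
        g = sample_grad(x_hat)
        x = x_hat - lr * (g - g_hat)
    return x

# Toy usage: minimize ||x||^2 with an intentionally biased gradient model.
true_grad = lambda x: 2.0 * x
model_grad = lambda x: 1.5 * x
print(piccolo_gradient_descent(np.ones(3), true_grad, model_grad, lr=0.1, n_iters=100))
```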
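The Experiment Setup row quotes the Adam hyperparameters, the learning rate, and the per-iteration sample budget. Below is a minimal PyTorch sketch of that optimizer configuration under stated assumptions: the small policy network is a placeholder, and only the Adam parameters (β1 = 0.9, β2 = 0.999, ε = 1e-8) and the learning rate 0.001 come from the quoted text.

```python
import torch
import torch.nn as nn

# Placeholder policy network; the architecture is an assumption, not from the paper.
policy = nn.Sequential(nn.Linear(11, 64), nn.Tanh(), nn.Linear(64, 3))

# Adam with the PyTorch defaults quoted in the setup row:
# lr = 0.001, beta1 = 0.9, beta2 = 0.999, eps = 1e-8.
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3,
                             betas=(0.9, 0.999), eps=1e-8)

# Per the setup row, each training iteration would consume roughly
# 2000 environment steps gathered from 10 rollouts before an update.
```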