Predictor-Corrector Policy Optimization
Authors: Ching-An Cheng, Xinyan Yan, Nathan Ratliff, Byron Boots
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show, in both theory and simulation, that the convergence rate of several first-order model-free algorithms can be improved by PICCOLO. ... To validate the theory, we PICCOLO multiple algorithms in simulation. The experimental results show that the PICCOLOed versions consistently surpass the base algorithm and are robust to model errors. |
| Researcher Affiliation | Collaboration | Ching-An Cheng 1 2 Xinyan Yan 1 Nathan Ratliff 2 Byron Boots 1 2 1Georgia Tech 2NVIDIA. Correspondence to: Ching-An Cheng <cacheng@gatech.edu>. |
| Pseudocode | Yes | Algorithm 1 PICCOLO (see the sketch after this table) |
| Open Source Code | Yes | The codes are available at https://github.com/gtrll/rlfamily. |
| Open Datasets | Yes | robot RL tasks (Cart Pole, Hopper, Snake, and Walker3D) from OpenAI Gym (Brockman et al., 2016) with the DART physics engine (Lee et al., 2018) |
| Dataset Splits | No | The paper mentions using OpenAI Gym environments but does not specify exact training, validation, or test dataset splits or percentages. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments, such as exact GPU or CPU models. |
| Software Dependencies | No | The paper mentions implementing the algorithm using PyTorch and OpenAI Gym, but it does not specify concrete version numbers for these software dependencies (e.g., 'PyTorch 1.9' or 'OpenAI Gym X.Y'). |
| Experiment Setup | Yes | We implement our algorithm using PyTorch (Paszke et al., 2017) and OpenAI Gym (Brockman et al., 2016). For the Adam optimizer, we use the default parameters in PyTorch (β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁸). ... ADAM learning rate 0.001. ... Each iteration of the algorithms involves collecting 2000 steps from 10 rollouts. ... For the TRPO and NATGRAD algorithms, we use the implementation from OpenAI Baselines (Dhariwal et al., 2017) with default hyperparameters. (Sketches of the Adam configuration and batch collection follow the table.) |
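The Pseudocode row above refers to the paper's Algorithm 1 (PICCOLO), which splits a base first-order update into a prediction step using a model-predicted gradient and a correction step using the prediction error. The following is a minimal sketch of that high-level scheme on a toy quadratic, not the authors' implementation; `base_update`, `true_gradient`, and `model_gradient` are hypothetical stand-ins, and the 10% model bias is invented for illustration.

```python
# Minimal sketch of PICCOLO's predictor-corrector loop on f(x) = x^2.
# Illustrates the paper's high-level scheme only; not the authors' code.

def base_update(x, g, lr=0.1):
    # Stand-in for any first-order update rule (here, plain gradient descent).
    return x - lr * g

def true_gradient(x):
    # "Sampled" gradient of f(x) = x^2 (sampling noise omitted for clarity).
    return 2.0 * x

def model_gradient(x):
    # Predicted gradient from an imperfect model (hypothetical 10% bias).
    return 1.8 * x

x = 5.0
for n in range(50):
    g_hat = model_gradient(x)                # predict the upcoming gradient
    x_hat = base_update(x, g_hat)            # prediction step
    g = true_gradient(x_hat)                 # gradient observed at x_hat
    x = base_update(x_hat, g - g_hat)        # correction step with the error
print(x)  # converges toward the minimizer 0.0
```

Because the correction step only consumes the error g − ĝ, an accurate model shrinks the corrective updates to nearly zero, which is the mechanism behind the claimed robustness to model errors.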
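The quoted Adam hyperparameters map directly onto PyTorch's `torch.optim.Adam` constructor. A minimal sketch follows; the two-layer `policy` network is a hypothetical placeholder (the paper's architectures are not reproduced here), while the optimizer values are exactly those quoted in the setup.

```python
import torch
import torch.nn as nn

# Hypothetical policy network; layer sizes are placeholders.
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))

# Adam with the PyTorch defaults quoted in the paper's setup:
# learning rate 0.001, β1 = 0.9, β2 = 0.999, ϵ = 1e-8.
optimizer = torch.optim.Adam(
    policy.parameters(),
    lr=0.001,
    betas=(0.9, 0.999),
    eps=1e-8,
)
```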
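The setup also states that each iteration collects 2000 steps from 10 rollouts. A sketch of such a collection loop is below, assuming the classic (pre-0.26) Gym `reset`/`step` API of the paper's era; `CartPole-v1` and the random `policy_fn` are stand-ins (the paper uses DART-based task variants), and 200 steps per rollout is inferred from 2000 / 10.

```python
import gym

# Hypothetical data-collection loop matching the quoted batch size:
# each iteration gathers 2000 environment steps from 10 rollouts.
env = gym.make("CartPole-v1")  # stand-in for the DART-based Gym tasks

def collect_batch(policy_fn, n_rollouts=10, steps_per_rollout=200):
    batch = []
    for _ in range(n_rollouts):
        obs = env.reset()
        for _ in range(steps_per_rollout):
            action = policy_fn(obs)
            next_obs, reward, done, _ = env.step(action)
            batch.append((obs, action, reward))
            obs = env.reset() if done else next_obs
    return batch  # 2000 (obs, action, reward) transitions per iteration

transitions = collect_batch(lambda obs: env.action_space.sample())
print(len(transitions))  # 2000
```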