Probabilistic Differential Dynamic Programming
Authors: Yunpeng Pan, Evangelos Theodorou
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the PDDP framework using two nontrivial simulated examples: i) cart-double inverted pendulum swing-up; ii) six-link robotic arm reaching. We also compare the learning efficiency of PDDP with the classical DDP [1] and PILCO [13][14]. All experiments were performed in MATLAB. |
| Researcher Affiliation | Academia | Yunpeng Pan and Evangelos A. Theodorou, Daniel Guggenheim School of Aerospace Engineering, Institute for Robotics and Intelligent Machines, Georgia Institute of Technology, Atlanta, GA 30332; ypan37@gatech.edu, evangelos.theodorou@ae.gatech.edu |
| Pseudocode | Yes | The proposed algorithm can be summarized in Algorithm 1. The algorithm consists of 8 modules. ... Algorithm 1: PDDP algorithm |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement, or code in supplementary materials) for the source code of the described methodology. |
| Open Datasets | No | The paper uses data generated from simulated examples ('Cart-Double Inverted Pendulum' and 'Six-link robotic arm') and states 'We sample 4 initial trajectories...' and 'We sample 2 initial trajectories...'. It does not provide concrete access information (link, DOI, formal citation) for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning for their experimental evaluations. |
| Hardware Specification | No | The paper states 'All experiments were performed in MATLAB.' but does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper states 'All experiments were performed in MATLAB.' but does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We sample 4 initial trajectories with time horizon H = 50. ... We sample 2 initial trajectories with time horizon H = 50. ... In Controller learning (Step 5) we compute a local optimal control sequence (16) by backward-propagation of the value function (17). To ensure convergence, we employ the line search strategy as in [2]. We compute the control law as $\delta\hat{u}_k = \alpha I_k + L_k \delta z^x_k$. Initially α = 1, then decrease it until the expected cost is smaller than the previous one. |
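
The line search quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation. The toy dynamics `x_next = x + u`, the quadratic cost, the halving factor `shrink`, the cap `max_tries`, and all function names are assumptions made for illustration; the quoted text only says that α starts at 1 and is decreased until the expected cost is smaller than the previous one. A deterministic scalar state deviation stands in for the paper's δz^x_k.

```python
from typing import List, Optional, Tuple


def rollout_cost(alpha: float, x0: float, x_nom: List[float],
                 u_nom: List[float], I: List[float], L: List[float]) -> float:
    """Cost of applying du_k = alpha * I_k + L_k * dx_k on a toy system.

    Assumed toy model (not from the paper): x_next = x + u, cost = sum of x^2.
    """
    x = x0
    cost = 0.0
    for k in range(len(u_nom)):
        dx = x - x_nom[k]               # deviation from the nominal state
        du = alpha * I[k] + L[k] * dx   # scaled feedforward plus feedback
        x = x + u_nom[k] + du           # hypothetical dynamics
        cost += x * x                   # hypothetical quadratic cost
    return cost


def line_search(previous_cost: float, x0: float, x_nom: List[float],
                u_nom: List[float], I: List[float], L: List[float],
                shrink: float = 0.5,
                max_tries: int = 20) -> Optional[Tuple[float, float]]:
    """Start at alpha = 1 and shrink it until the cost drops below previous_cost."""
    alpha = 1.0
    for _ in range(max_tries):
        cost = rollout_cost(alpha, x0, x_nom, u_nom, I, L)
        if cost < previous_cost:
            return alpha, cost
        alpha *= shrink                 # assumed schedule: halve alpha
    return None                         # no improving step found


# Toy problem: the nominal trajectory stays at x = 1 with zero control,
# and the feedforward term at k = 0 deliberately overshoots.
x_nom = [1.0, 1.0, 1.0, 1.0]
u_nom = [0.0, 0.0, 0.0, 0.0]
I = [-3.0, 0.0, 0.0, 0.0]
L = [-0.5, -0.5, -0.5, -0.5]
previous_cost = rollout_cost(0.0, 1.0, x_nom, [0.0] * 4, [0.0] * 4, [0.0] * 4)
result = line_search(previous_cost, 1.0, x_nom, u_nom, I, L)
```

In this toy problem the full step (α = 1) overshoots and raises the cost, so the search rejects it and accepts the first halved step. Returning `None` after `max_tries` attempts is a design choice of the sketch; the quoted text does not specify a stopping rule.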