Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework
Authors: Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate three learning modes of the PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of the PDP in each learning mode on different high-dimensional systems, including multi-link robot arm, 6-DoF maneuvering quadrotor, and 6-DoF rocket powered landing. |
| Researcher Affiliation | Academia | Wanxin Jin (Purdue University), Zhaoran Wang (Northwestern University), {wanxinjin,zhaoranwang}@gmail.com; Zhuoran Yang (Princeton University), zy6@princeton.edu; Shaoshuai Mou (Purdue University), mous@purdue.edu |
| Pseudocode | Yes | Algorithm 1: Solving $\partial \xi_\theta / \partial \theta$ using Auxiliary Control System (see detailed version in Appendix D) |
| Open Source Code | Yes | Both PDP and environment codes are available at https://github.com/wanxinjin. |
| Open Datasets | No | The paper mentions environments like Cartpole, Two-link robot arm, 6-DoF quadrotor maneuvering, and 6-DoF rocket powered landing, but it does not provide concrete access information (e.g., URL, DOI, specific citation with author/year for a dataset) for the demonstration or collected data used in the experiments. |
| Dataset Splits | No | The paper does not explicitly provide information about training, validation, or test dataset splits. It mentions using 'demonstrations' or 'data collected from, say, a physical system' but no details on how these were split for training or evaluation. |
| Hardware Specification | No | The paper mentions comparing running time (Fig. 5d, Fig. 6) and discusses computational complexity, but it does not specify any hardware details like GPU/CPU models or memory used for the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments (e.g., Python, PyTorch, TensorFlow, specific solvers). |
| Experiment Setup | Yes | We set learning rate $\eta = 10^{-4}$ and run five trials given random initial $\theta_0$. For all methods, we set learning rate $\eta = 10^{-4}$, and run five trials with random $\theta_0$. We set learning rate $\eta = 10^{-4}$ or $10^{-6}$ and run five trials for each system. |
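
The experiment setup quoted in the table (a fixed learning rate and five trials from random initial parameters) can be illustrated with a minimal sketch. This is not the PDP code: the quadratic `loss`, its `grad`, the parameter dimension, the sampling range, and the iteration count are all hypothetical stand-ins, and the learning rate is read here as 10^-4 on the assumption that the exponent sign was lost in extraction. The paper's actual losses depend on system trajectories and are not reproduced.

```python
import random

ETA = 1e-4        # learning rate from the setup row, read as 10^-4 (assumption)
NUM_TRIALS = 5    # five trials, each from a random initial theta_0
NUM_ITERS = 2000  # assumption: the table does not state an iteration count


def loss(theta):
    # Hypothetical stand-in loss with its minimum at theta = (1, 1, 1).
    return sum((t - 1.0) ** 2 for t in theta)


def grad(theta):
    # Analytic gradient of the stand-in loss above.
    return [2.0 * (t - 1.0) for t in theta]


def run_trial(seed, dim=3):
    # Seeded generator so each trial is reproducible.
    rng = random.Random(seed)
    # Random initial parameters; the range [-1, 1] is an assumption.
    theta = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    history = [loss(theta)]
    for _ in range(NUM_ITERS):
        g = grad(theta)
        # Plain gradient descent step: theta <- theta - eta * grad.
        theta = [t - ETA * gi for t, gi in zip(theta, g)]
        history.append(loss(theta))
    return history


histories = [run_trial(seed) for seed in range(NUM_TRIALS)]
for seed, h in enumerate(histories):
    print(f"trial {seed}: initial loss {h[0]:.4f}, final loss {h[-1]:.4f}")
```

Recording the seed for each trial is one way to make the five runs repeatable, which matters here because the table reports no dataset splits, hardware details, or dependency versions.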