Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework

Authors: Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou

NeurIPS 2020

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental
  Evidence: "We investigate three learning modes of the PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of the PDP in each learning mode on different high-dimensional systems, including multi-link robot arm, 6-DoF maneuvering quadrotor, and 6-DoF rocket powered landing."
Researcher Affiliation: Academia
  Evidence: Wanxin Jin (Purdue University, wanxinjin@gmail.com), Zhaoran Wang (Northwestern University, zhaoranwang@gmail.com), Zhuoran Yang (Princeton University, zy6@princeton.edu), Shaoshuai Mou (Purdue University, mous@purdue.edu).
Pseudocode: Yes
  Evidence: "Algorithm 1: Solving ∂ξθ/∂θ using Auxiliary Control System (see detailed version in Appendix D)."
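
  For context, Algorithm 1 supplies the trajectory sensitivity ∂ξθ/∂θ that the PDP outer loop consumes via the chain rule, ∂L/∂θ = (∂L/∂ξθ)(∂ξθ/∂θ). Below is a minimal runnable sketch of one such outer-loop step on a toy problem, assuming hypothetical helpers rollout, loss_grad, and aux_sys_grad (the last standing in for Algorithm 1's output); this is an illustration of the chain-rule structure, not the authors' implementation.

```python
import numpy as np

def pdp_step(theta, eta, rollout, loss_grad, aux_sys_grad):
    """One outer-loop PDP gradient step (sketch).

    rollout(theta)          -> xi, trajectory of the theta-parameterized OC problem
    loss_grad(xi)           -> dL/dxi, outer-loss gradient along the trajectory
    aux_sys_grad(xi, theta) -> dxi/dtheta, the sensitivity Algorithm 1 computes
    """
    xi = rollout(theta)                                  # forward pass: solve the OC problem
    dL_dtheta = loss_grad(xi) @ aux_sys_grad(xi, theta)  # chain rule: dL/dtheta
    return theta - eta * dL_dtheta                       # gradient-descent update

# Toy stand-in: the "optimal trajectory" is linear in theta, xi = A @ theta,
# and the outer loss is L = 0.5 * ||xi - xi_star||^2, so dL/dxi = xi - xi_star.
A = np.array([[1.0, 0.0], [2.0, 1.0], [0.0, 3.0]])
xi_star = np.array([1.0, 2.0, 3.0])
rollout = lambda th: A @ th
loss_grad = lambda xi: xi - xi_star
aux_sys_grad = lambda xi, th: A   # here dxi/dtheta is simply A

theta = np.zeros(2)
for _ in range(2000):
    theta = pdp_step(theta, 1e-2, rollout, loss_grad, aux_sys_grad)
# theta now approximates argmin ||A @ theta - xi_star||
```

  Swapping aux_sys_grad for an actual auxiliary-control-system solve along the optimal trajectory would recover the full PDP pipeline described in the paper.
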
Open Source Code: Yes
  Evidence: "Both PDP and environment codes are available at https://github.com/wanxinjin."
Open Datasets: No
  The paper mentions environments such as Cartpole, a two-link robot arm, 6-DoF quadrotor maneuvering, and 6-DoF rocket powered landing, but it provides no concrete access information (e.g., a URL, DOI, or specific citation) for the demonstration or collected data used in the experiments.
Dataset Splits: No
  The paper does not explicitly describe training, validation, or test splits. It mentions using "demonstrations" or "data collected from, say, a physical system," but gives no details on how these were divided for training or evaluation.
Hardware Specification: No
  The paper compares running times (Fig. 5d, Fig. 6) and discusses computational complexity, but it does not specify hardware details such as GPU/CPU models or memory.
Software Dependencies: No
  The paper does not provide version numbers for the software dependencies or libraries used in the experiments (e.g., Python, PyTorch, TensorFlow, or specific solvers).
Experiment Setup: Yes
  Evidence: "We set learning rate η = 10⁻⁴ and run five trials given random initial θ₀." "For all methods, we set learning rate η = 10⁻⁴, and run five trials with random θ₀." "We set learning rate η = 10⁻⁴ or 10⁻⁶ and run five trials for each system."
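
  Tying the reported setup to the sketch above, here is a hedged outline of the trial protocol: learning rate η = 10⁻⁴ (10⁻⁶ for some systems) and five trials from random θ₀, as quoted. The iteration budget, parameter dimension, and seed are illustrative assumptions, and pdp_step with its toy helpers comes from the previous block.

```python
import numpy as np

eta = 1e-4           # learning rate reported in the paper (1e-6 for some systems)
num_trials = 5       # five trials with random initial theta_0, as reported
num_iters = 10_000   # iteration budget: an illustrative assumption
dim_theta = 2        # matches the toy problem sketched above

rng = np.random.default_rng(0)
final_losses = []
for trial in range(num_trials):
    theta = rng.standard_normal(dim_theta)  # random initial theta_0
    for _ in range(num_iters):
        theta = pdp_step(theta, eta, rollout, loss_grad, aux_sys_grad)
    final_losses.append(0.5 * np.sum((rollout(theta) - xi_star) ** 2))
print(final_losses)  # spread across trials indicates sensitivity to theta_0
```
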