reproducibilityindex.ai

Sample Efficient Path Integral Control under Uncertainty

Authors: Yunpeng Pan, Evangelos Theodorou, Michail Kontitsis

NeurIPS 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We provide experimental results on three different tasks and comparisons with state-of-the-art model-based methods to demonstrate the efficiency and generalizability of the proposed framework.
Researcher Affiliation	Academia	Yunpeng Pan, Evangelos A. Theodorou, and Michail Kontitsis; Autonomous Control and Decision Systems Laboratory, Institute for Robotics and Intelligent Machines, School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA 30332; {ypan37,evangelos.theodorou,kontitsis}@gatech.edu
Pseudocode	Yes	Algorithm 1: Sample efficient path integral control under uncertain dynamics
Open Source Code	No	The paper does not provide any information or links regarding open-source code for the described methodology.
Open Datasets	Yes	We consider 3 simulated RL tasks: cart-pole (CP) swing up, double pendulum on a cart (DPC) swing up, and PUMA-560 robotic arm reaching.
Dataset Splits	No	The paper describes the number of sample rollouts used for initialization and trials, but it does not specify explicit training, validation, or test dataset splits in terms of percentages or counts for a fixed dataset, nor does it reference predefined splits for reproducibility.
Hardware Specification	No	The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments.
Software Dependencies	No	The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions or other libraries/solvers).
Experiment Setup	Yes	For both tasks we choose T = 1.2 and dt = 0.02 (60 time steps per rollout). The iterative PI [18] with a given dynamics model uses 10^3/10^4 (CP/DPC) sample rollouts per iteration and 500 iterations at each time step. We initialize PILCO and the proposed method by collecting 2/6 sample rollouts (corresponding to 120/360 transition samples) for CP/DPC tasks respectively. At each trial (on the true dynamics model), we use 1 sample rollout for PILCO and our method. PDDP uses 4/5 rollouts (corresponding to 240/300 transition samples) for initialization as well as at each trial for the CP/DPC tasks. ... For all tasks we initialize with 3 sample rollouts and 1 sample at each trial.
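
The sample counts quoted in the Experiment Setup row follow from the horizon and step size, and the short Python sketch below re-derives them as a sanity check. It is not code from the paper: the names `steps_per_rollout`, `transition_samples`, and `init_rollouts` are hypothetical, and it assumes that every rollout contributes exactly T / dt transition samples.

```python
# Sanity check of the sample counts quoted in the Experiment Setup row.
# All names here are illustrative; none of this comes from the paper's code.

T = 1.2    # rollout horizon, as quoted
dt = 0.02  # time step, as quoted

# round() guards against floating-point error in 1.2 / 0.02.
steps_per_rollout = round(T / dt)


def transition_samples(num_rollouts, steps=steps_per_rollout):
    """Transition samples collected, assuming one sample per time step."""
    return num_rollouts * steps


# Initialization rollouts per method for the CP and DPC tasks, as quoted.
init_rollouts = {
    "PILCO / proposed method": {"CP": 2, "DPC": 6},
    "PDDP": {"CP": 4, "DPC": 5},
}

for method, tasks in init_rollouts.items():
    for task, n in tasks.items():
        print(f"{method}, {task}: {n} rollouts -> {transition_samples(n)} samples")
```

Under that assumption the computed values agree with the figures in the row: 60 steps per rollout, 120/360 samples for PILCO and the proposed method, and 240/300 for PDDP.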