DiSProD: Differentiable Symbolic Propagation of Distributions for Planning

Authors: Palash Chatterjee, Ashutosh Chapagain, Weizhe Chen, Roni Khardon

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | An extensive experimental evaluation compares DiSProD to state-of-the-art planners in discrete-time planning and real-time control of robotic systems. The proposed method improves over existing planners in handling stochastic environments, sensitivity to search depth, sparsity of rewards, and large action spaces. Extensive quantitative experiments are conducted to compare DiSProD with state-of-the-art planners in discrete-time planning in OpenAI Gym environments [Brockman et al., 2016] and continuous-time control of simulated robotic systems.
Researcher Affiliation | Academia | Palash Chatterjee, Ashutosh Chapagain, Weizhe Chen and Roni Khardon, Department of Computer Science, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, Indiana, USA. {palchatt, aschap, chenweiz, rkhardon}@iu.edu
Pseudocode | Yes | Algorithm 1 (One Step of DiSProD). Input: state s_t. 1: initialize actions (for all restarts); 2: build computation graph till depth D; 3: while actions have not converged do: 4: loss = Σ_k Q̂(s_t, μ̂_a^k, v̂_a^k); 5: loss.backward(); 6: actions ← safe-projected-gradient-update(actions); 7: save action means μ̂^k for a_{t+1:t+D} from the best restart k; 8: return action ~ N(μ̂^k_{a_t}, v̂^k_{a_t}).
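The multi-restart gradient-ascent loop of Algorithm 1 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: a hand-differentiated toy objective `q_hat` stands in for DiSProD's symbolically propagated Q̂ over distributions, simple box clipping stands in for the safe projected-gradient update, and action variances are omitted (only means are optimized, and the mean action itself is returned rather than a sample from N(μ̂, v̂)).

```python
import numpy as np

def plan_one_step(state, q_hat, grad_q, n_restarts=200, depth=10,
                  lr=0.1, n_iters=100, action_low=-1.0, action_high=1.0,
                  seed=0):
    """One planning step in the style of Algorithm 1 (sketch only).

    q_hat(state, actions)  -> scalar score of one action sequence
    grad_q(state, actions) -> gradient of q_hat w.r.t. the actions
    """
    rng = np.random.default_rng(seed)
    # 1: initialize action sequences for all restarts, shape (restarts, depth)
    actions = rng.uniform(action_low, action_high,
                          size=(n_restarts, depth))
    for _ in range(n_iters):
        # 4-6: gradient ascent on the per-restart objective, then project
        # back into the feasible action box (stand-in for the "safe"
        # projected-gradient update).
        actions += lr * grad_q(state, actions)
        actions = np.clip(actions, action_low, action_high)
    # 7: keep the action sequence from the best restart
    scores = np.array([q_hat(state, a) for a in actions])
    best = actions[np.argmax(scores)]
    # 8: return the first action of the best sequence
    return best[0]

# Hypothetical toy objective: prefer actions near the target 0.5.
TARGET = 0.5
q_hat = lambda s, a: -np.sum((a - TARGET) ** 2)
grad_q = lambda s, a: -2.0 * (a - TARGET)

first_action = plan_one_step(state=None, q_hat=q_hat, grad_q=grad_q)
```

With this concave toy objective every restart converges to the target, so the returned first action is 0.5; in the actual planner the restarts matter because Q̂ is non-concave and different initializations reach different local optima.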
Open Source Code | Yes | The full paper as well as code to reproduce the experiments and videos from physical experiments are available at https://pecey.github.io/DiSProD.
Open Datasets | Yes | Extensive quantitative experiments are conducted to compare DiSProD with state-of-the-art planners in discrete-time planning in OpenAI Gym environments [Brockman et al., 2016] and continuous-time control of simulated robotic systems.
Dataset Splits | No | The paper does not provide specific training/validation/test dataset split information (exact percentages, sample counts, or detailed splitting methodology) for its experiments. It mentions using OpenAI Gym environments but does not detail how data was partitioned within these environments.
Hardware Specification | No | The paper mentions running experiments on a TurtleBot in Gazebo simulation, controlling the Jackal and Heron robotic systems, and using the Big Red computing system at Indiana University. However, it does not specify concrete hardware details such as exact GPU/CPU models, processor types, or memory amounts used for these experiments.
Software Dependencies | No | The paper mentions using OpenAI Gym environments and a Robot Operating System (ROS) interface, but it does not provide specific version numbers for these software components.
Experiment Setup | Yes | In the experiments we use 200 restarts, and for DiSProD lr_v = lr_μ/10. The values of D and lr_μ for Gym environments are provided in Figure 2. For TurtleBot we use D = 100, lr_μ = 10. For Jackal and Heron, D is modified to 30 and 70 respectively, and we use 400 restarts.
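The quoted setup can be collected into a small configuration table; the function and key names below are hypothetical, but the values are those reported above. The lr_μ for Jackal and Heron is not stated explicitly in the quote and is assumed here to carry over from TurtleBot.

```python
def make_config(depth, lr_mu, restarts=200):
    """Build one planner configuration (hypothetical helper).

    Per the reported setup, lr_v = lr_mu / 10 and the default is
    200 restarts.
    """
    return {"depth": depth, "lr_mu": lr_mu,
            "lr_v": lr_mu / 10, "restarts": restarts}

configs = {
    # D and lr_mu for the Gym environments come from the paper's
    # Figure 2, so only the robotic setups quoted above are listed.
    "turtlebot": make_config(depth=100, lr_mu=10),
    # lr_mu for Jackal/Heron is an assumption (carried over from
    # TurtleBot); D and the 400 restarts are as reported.
    "jackal": make_config(depth=30, lr_mu=10, restarts=400),
    "heron": make_config(depth=70, lr_mu=10, restarts=400),
}
```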