reproducibilityindex.ai

Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient

Authors: Steven Li, Rickmer Krohn, Tao Chen, Anurag Ajay, Pulkit Agrawal, Georgia Chalvatzaki

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical studies validate DDiffPG's capability to master multimodal behaviors in complex, high-dimensional continuous control tasks with sparse rewards, also showcasing proof-of-concept dynamic online replanning when navigating mazes with unseen obstacles.
Researcher Affiliation	Collaboration	Zechu Li (1,2), Rickmer Krohn (1,3), Tao Chen (2), Anurag Ajay (2), Pulkit Agrawal (2), Georgia Chalvatzaki (1,3); (1) Technical University of Darmstadt, (2) Massachusetts Institute of Technology, (3) Hessian.AI
Pseudocode	Yes	Fig. 2 provides an overview of the proposed method, and the pseudocode is available in Alg. 1.
Open Source Code	Yes	Our project page is available at https://supersglzc.github.io/projects/ddiffpg/.
Open Datasets	Yes	The Ant Maze environments are implemented based on the D4RL benchmark [20]. The robotic manipulation environments with Franka are based on [22].
Dataset Splits	No	The paper mentions running experiments with five random seeds and evaluating on 20 episodes, but does not provide specific train/validation/test dataset splits.
Hardware Specification	Yes	We use NVIDIA GeForce RTX 4090 for all experiments.
Software Dependencies	No	The paper does not provide specific version numbers for software dependencies or libraries used in the experiments.
Experiment Setup	Yes	Hyperparameters are available in Tab. E.