Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient
Authors: Zechu Li, Rickmer Krohn, Tao Chen, Anurag Ajay, Pulkit Agrawal, Georgia Chalvatzaki
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies validate DDiffPG's capability to master multimodal behaviors in complex, high-dimensional continuous control tasks with sparse rewards, and also showcase proof-of-concept dynamic online replanning when navigating mazes with unseen obstacles. (A generic denoising sketch appears after the table.) |
| Researcher Affiliation | Collaboration | Zechu Li (1,2), Rickmer Krohn (1,3), Tao Chen (2), Anurag Ajay (2), Pulkit Agrawal (2), Georgia Chalvatzaki (1,3); 1: Technical University of Darmstadt, 2: Massachusetts Institute of Technology, 3: Hessian.AI |
| Pseudocode | Yes | Fig. 2 provides an overview of the proposed method, and the pseudocode is available in Alg. 1. |
| Open Source Code | Yes | Our project page is available at https://supersglzc.github.io/projects/ddiffpg/. |
| Open Datasets | Yes | The Ant Maze environments are implemented based on the D4RL benchmark [20]. The robotic manipulation environments with Franka are based on [22]. (See the D4RL usage sketch after the table.) |
| Dataset Splits | No | The paper mentions running experiments with five random seeds and evaluating on 20 episodes, but does not provide specific train/validation/test dataset splits. |
| Hardware Specification | Yes | We use NVIDIA GeForce RTX 4090 for all experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Hyperparameters are available in Tab. E. |
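
For context on the "Research Type" row: DDiffPG represents the policy as a diffusion model, so producing an action means iteratively denoising a Gaussian noise sample conditioned on the observation. The snippet below is a minimal, generic DDPM-style reverse-sampling sketch, not the authors' implementation; the `denoiser` network and its signature, the variance schedule `betas`, and the action clamp are all assumptions made for illustration.

```python
import torch

@torch.no_grad()
def sample_action(denoiser, obs, action_dim, betas):
    """Generic DDPM reverse process: denoise Gaussian noise into an action.

    Assumes denoiser(a_t, t, obs) predicts the noise eps added at step t
    (hypothetical interface, not the paper's actual network).
    """
    alphas = 1.0 - betas                       # per-step signal retention
    alpha_bars = torch.cumprod(alphas, dim=0)  # cumulative products over steps
    a = torch.randn(obs.shape[0], action_dim)  # a_T ~ N(0, I)
    for t in reversed(range(len(betas))):
        t_batch = torch.full((obs.shape[0],), t, dtype=torch.long)
        eps = denoiser(a, t_batch, obs)        # predicted noise at step t
        # Standard DDPM posterior mean for a_{t-1} given a_t
        mean = (a - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(a) if t > 0 else torch.zeros_like(a)
        a = mean + torch.sqrt(betas[t]) * noise
    return a.clamp(-1.0, 1.0)                  # assume actions normalized to [-1, 1]
```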
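For the "Open Datasets" row: the paper states its Ant Maze environments are implemented based on the D4RL benchmark. As a point of reference, the sketch below shows how the original D4RL AntMaze environments are typically instantiated; the specific environment id is chosen for illustration, and the paper's own environment implementations may differ.

```python
import gym
import d4rl  # noqa: F401 -- importing d4rl registers the AntMaze envs with gym

env = gym.make("antmaze-medium-diverse-v2")  # env id chosen for illustration
obs = env.reset()
action = env.action_space.sample()           # random action in the Ant's action space
obs, reward, done, info = env.step(action)   # sparse reward, given only at the goal
```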