Model-based Diffusion for Trajectory Optimization
Authors: Chaoyi Pan, Zeji Yi, Guanya Shi, Guannan Qu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations show that MBD outperforms state-of-the-art reinforcement learning and sampling-based TO methods in challenging contact-rich tasks. |
| Researcher Affiliation | Academia | Chaoyi Pan, Zeji Yi, Guanya Shi, Guannan Qu; Carnegie Mellon University; {chaoyip,zejiy,guanyas,gqu}@andrew.cmu.edu |
| Pseudocode | Yes | Algorithm 1 Model-based Diffusion for Generic Optimization Algorithm 2 Model-based Diffusion for Trajectory Optimization |
| Open Source Code | Yes | Videos and codes: https://lecar-lab.github.io/mbd/ |
| Open Datasets | Yes | For Humanoid Jogging, we use data from the CMU Mocap dataset [1], from which we extract torso, thigh, and shin positions and use them as a partial state reference. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. It describes the environments and tasks used for evaluation, but not how a specific dataset was partitioned for training and validation purposes. |
| Hardware Specification | Yes | All the experiments were conducted on a single NVIDIA RTX 4070 Ti GPU. For the BO benchmarks, the experiments were conducted on an A100 GPU because of the high computational demands of the Gaussian Process Regression Model it incorporates. |
| Software Dependencies | No | The paper mentions several software tools, frameworks, and baseline algorithms, such as Google Brax, PPO, SAC, CMA-ES, CEM, MPPI, pycma, and Nevergrad. However, it does not provide specific version numbers for these software components, which would be required for a fully reproducible description of the ancillary software. |
| Experiment Setup | Yes | We use the same hyperparameters for all the tasks, with small tweaks for harder tasks; the per-task Horizon, Sample Number, and Temperature λ are listed in Table 4. For diffusion noise scheduling, we use simple linear scheduling with β₀ = 1×10⁻⁴ and β_N = 1×10⁻², and the diffusion step number is 100 across all tasks. For the reinforcement learning implementation, we strictly follow the hyperparameters and implementation details provided by the original Brax repository, which are optimized for the best performance. The hyperparameters for the RL tasks are shown in Table 5 and Table 6. |
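The linear noise schedule quoted in the Experiment Setup row is easy to reproduce. Below is a minimal JAX sketch, assuming standard DDPM-style notation (per-step β values, ᾱ as their cumulative product); the variable names are illustrative and not taken from the authors' released code.

```python
import jax.numpy as jnp

# Linear beta schedule: beta_0 = 1e-4, beta_N = 1e-2, N = 100 steps,
# exactly as quoted in the Experiment Setup row above.
N = 100
betas = jnp.linspace(1e-4, 1e-2, N)
alphas = 1.0 - betas                 # per-step signal retention
alpha_bars = jnp.cumprod(alphas)     # cumulative signal fraction at step i

# After all N steps, roughly 60% of the signal variance remains:
# prod(1 - beta_i) ~ exp(-sum(beta_i)) = exp(-0.505) ~ 0.60.
print(float(alpha_bars[-1]))
```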
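Algorithms 1 and 2 appear only as pseudocode in the paper. The following is a hedged JAX sketch of the reverse-diffusion loop they describe, under the assumption that each denoising step moves toward the softmax(J/λ)-weighted mean of candidates scored by model rollouts. `mbd_sketch`, `rollout_reward`, and all parameter names are hypothetical placeholders, not the authors' API; consult the paper's Algorithm 1 and 2 for the exact update rule.

```python
import jax
import jax.numpy as jnp

def mbd_sketch(rollout_reward, key, horizon, action_dim,
               n_samples=128, n_steps=100, lam=0.1):
    # Linear noise schedule, as quoted in the Experiment Setup row.
    betas = jnp.linspace(1e-4, 1e-2, n_steps)
    alpha_bars = jnp.cumprod(1.0 - betas)

    # Start from pure Gaussian noise over the whole action sequence.
    key, sub = jax.random.split(key)
    y = jax.random.normal(sub, (horizon, action_dim))

    for i in reversed(range(n_steps)):
        key, sub = jax.random.split(key)
        # Candidate action sequences around the current iterate, at the
        # noise level implied by step i.
        sigma = jnp.sqrt((1.0 - alpha_bars[i]) / alpha_bars[i])
        cands = (y / jnp.sqrt(alpha_bars[i])
                 + sigma * jax.random.normal(sub, (n_samples, horizon, action_dim)))
        # Score every candidate via a model rollout, then weight with
        # softmax(reward / lambda); lambda is the quoted temperature.
        weights = jax.nn.softmax(jax.vmap(rollout_reward)(cands) / lam)
        mean = jnp.einsum('k,khd->hd', weights, cands)
        # Re-scale the weighted mean to the next (lower) noise level.
        alpha_prev = alpha_bars[i - 1] if i > 0 else 1.0
        y = jnp.sqrt(alpha_prev) * mean
    return y  # denoised action sequence
```

With `n_samples` and `lam` set per Table 4, this loop matches the quoted setup of 100 diffusion steps; the paper's actual Algorithm 2 additionally enforces dynamical feasibility through the model rollouts.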