SceneDiffuser: Efficient and Controllable Driving Simulation Initialization and Rollout
Authors: Max Jiang, Yijing Bai, Andre Cornman, Christopher Davis, Xiukun Huang, Hong Jeon, Sakshum Kulshrestha, John Lambert, Shuangyu Li, Xuanyu Zhou, Carlos Fuertes, Chang Yuan, Mingxing Tan, Yin Zhou, Dragomir Anguelov
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our approach on the Waymo Open Sim Agents Challenge, achieving top open-loop performance and the best closed-loop performance among diffusion models. (See also Section 4, Experimental Results.) |
| Researcher Affiliation | Industry | All the authors are employees of Waymo LLC. |
| Pseudocode | Yes | We illustrate the three algorithms in Algorithms 1-3 using the same model trained with a noise mixture t ∼ {U(0, 1); t̂} (Eqn. 2). We also illustrate Algorithm 3 in Fig. 4. Algorithm 1: One-Shot (Open-Loop); Algorithm 2: Full AR (Closed-Loop); Algorithm 3: Amortized AR (Closed-Loop). (An illustrative sketch of the three rollout strategies follows this table.) |
| Open Source Code | No | We do not plan to release code in the near future. |
| Open Datasets | Yes | Dataset: We use the Waymo Open Motion Dataset (WOMD) [7] for both our scene generation and agent simulation experiments. Also: our dataset is based on the Waymo Open Motion Dataset, which is already publicly accessible. |
| Dataset Splits | Yes | Across the dataset splits, there exist 486,995 scenarios in train, 44,097 in validation, and 44,920 in test. |
| Hardware Specification | No | The paper mentions 'computational resources' and provides 'compute GFLOPs' in Figure 6, but does not specify details like exact GPU/CPU models or memory used for experiments. |
| Software Dependencies | No | The paper mentions software like 'Adafactor optimizer', 'Adam', and 'DPM++', but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Training details: We train with a batch size of 1024 for 1.2M steps. We select the most competitive model based on validation set performance, for which we perform a final evaluation using the test set. We use an initial learning rate of 3 × 10⁻⁴ and 16 diffusion sampling steps. When training, we mix the behavior prediction (BP) task with the scene generation task with probability 0.5. The randomized control mask is applied to both tasks. (A hypothetical configuration sketch appears after this table.) |
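
The paper describes Algorithms 1-3 but does not release code, so the sketch below is only a minimal, hypothetical illustration of the three rollout strategies. The denoiser `denoise_step`, the noise schedule, the tensor shapes, and the environment-update placeholder are all assumptions rather than the authors' implementation; in particular, the rolling-buffer reading of Amortized AR is one interpretation of amortizing diffusion steps across the rollout, which is consistent with the closed-loop compute savings the paper reports (Figure 6) but is not confirmed by released code.

```python
# Illustrative sketch only: SceneDiffuser's Algorithms 1-3 are not released as code.
# denoise_step, the noise schedule, tensor shapes, and the "environment update" are
# hypothetical placeholders, not the authors' implementation.
import numpy as np

rng = np.random.default_rng(0)

def denoise_step(x, t_now, t_next, scene_context):
    """Hypothetical single reverse-diffusion update from noise level t_now to t_next.
    A real model would query a learned denoiser conditioned on scene_context."""
    del scene_context  # unused in this placeholder
    return x * (t_next / max(t_now, 1e-6))  # placeholder: shrink noise with the step

def sample_future(scene_context, horizon, num_agents, num_steps=16):
    """Full reverse-diffusion pass over an entire future trajectory tensor."""
    x = rng.standard_normal((num_agents, horizon, 2))  # noisy (x, y) waypoints
    ts = np.linspace(1.0, 0.0, num_steps + 1)          # 1.0 = pure noise, 0.0 = clean
    for t_now, t_next in zip(ts[:-1], ts[1:]):
        x = denoise_step(x, t_now, t_next, scene_context)
    return x

def one_shot_rollout(scene_context, horizon, num_agents):
    """Algorithm 1 (One-Shot, open-loop): denoise the whole future once."""
    return sample_future(scene_context, horizon, num_agents)

def full_ar_rollout(scene_context, horizon, num_agents):
    """Algorithm 2 (Full AR, closed-loop): re-run the full sampler at every
    environment step and execute only the first predicted waypoint."""
    executed = []
    for _ in range(horizon):
        plan = sample_future(scene_context, horizon, num_agents)
        executed.append(plan[:, 0])
        scene_context = plan[:, 0]  # placeholder environment update
    return np.stack(executed, axis=1)

def amortized_ar_rollout(scene_context, horizon, num_agents, num_steps=16):
    """Algorithm 3 (Amortized AR, closed-loop), read here as a rolling buffer of
    future steps at staggered noise levels: each environment step spends one
    denoising update per buffered entry instead of a full reverse pass."""
    ts = np.linspace(1.0, 0.0, num_steps + 1)
    # Entry i still needs (i + 1) updates, i.e. sits at noise level ts[num_steps - 1 - i].
    buffer = rng.standard_normal((num_agents, num_steps, 2))
    executed = []
    for _ in range(horizon):
        for i in range(num_steps):
            buffer[:, i] = denoise_step(buffer[:, i], ts[num_steps - 1 - i],
                                        ts[num_steps - i], scene_context)
        executed.append(buffer[:, 0].copy())  # entry 0 is now fully denoised: execute it
        buffer = np.concatenate(              # shift the buffer and append fresh noise
            [buffer[:, 1:], rng.standard_normal((num_agents, 1, 2))], axis=1)
        scene_context = executed[-1]          # placeholder environment update
    return np.stack(executed, axis=1)
```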
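
The quoted training setup can likewise be summarized as a configuration sketch. The field names and the task-sampling helper below are illustrative assumptions; only the numeric values (batch size 1024, 1.2M steps, initial learning rate 3e-4, 16 sampling steps, 0.5 mixing probability) and the Adafactor mention come from the quoted rows above.

```python
# Hypothetical configuration object: field names are illustrative (no official code is
# released); the values are taken from the quoted training details above.
import random
from dataclasses import dataclass

@dataclass
class TrainConfig:
    batch_size: int = 1024              # train batch size
    train_steps: int = 1_200_000        # 1.2M optimizer steps
    init_learning_rate: float = 3e-4    # initial learning rate 3 x 10^-4
    diffusion_sampling_steps: int = 16  # diffusion sampling steps
    bp_task_prob: float = 0.5           # probability of the behavior-prediction task
    optimizer: str = "adafactor"        # the paper mentions the Adafactor optimizer

def sample_training_task(cfg: TrainConfig) -> str:
    """Mix the behavior prediction (BP) task with the scene generation task;
    the randomized control mask (not modeled here) is applied to both."""
    return "behavior_prediction" if random.random() < cfg.bp_task_prob else "scene_generation"

cfg = TrainConfig()
print(sample_training_task(cfg))
```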