Language-Driven Interactive Traffic Trajectory Generation
Authors: Junkai Xia, Chenxin Xu, Qingyao Xu, Yanfeng Wang, Siheng Chen
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show our method demonstrates superior performance over previous SoTA methods, offering a more realistic generation of interactive traffic trajectories with high controllability via diverse natural language commands. |
| Researcher Affiliation | Collaboration | 1 Shanghai Jiao Tong University, 2 Shanghai AI Laboratory, 3 Multi-Agent Governance & Intelligence Crew (MAGIC) |
| Pseudocode | No | The paper describes the methods in text and uses diagrams (e.g., Figure 3 for architecture) but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/X1a-jk/InteractTraj |
| Open Datasets | Yes | We use two datasets, Waymo Open Motion Dataset (WOMD) [59, 60] and nuPlan [61], which both provide real-world vehicle trajectories and corresponding lane maps. |
| Dataset Splits | No | For WOMD, we adopt 68,000 scenarios for training and 2,500 scenarios for testing, and for nuPlan, we selected 82,122 scenarios for training and 20,756 scenarios for testing from the whole dataset. Validation split is not explicitly mentioned. |
| Hardware Specification | Yes | It takes about 12 hours for 100 epochs on 4 NVIDIA GeForce RTX 4090 GPUs for the decoder training process. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer and GPT-4, but does not provide specific version numbers for the software libraries or frameworks used in their implementation. |
| Experiment Setup | Yes | In the language-to-code encoder, we sample the vehicle trajectories at 1-second (10 timesteps) intervals to get a set of \|T\| = 5 timesteps. In the code-to-trajectory decoder, the vehicle features DV and interaction features DI are set to 256. During the training process, we train the decoder using the AdamW optimizer [62] with an initial learning rate of 3e-4. |
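
The experiment setup reported above can be illustrated with a minimal PyTorch-style sketch. The `TrajectoryDecoder` class, its layer structure, and the input tensor layout are hypothetical placeholders and not the authors' implementation; only the stated hyperparameters (feature dimensions of 256, AdamW with an initial learning rate of 3e-4, and 1-second sampling yielding |T| = 5 timesteps) come from the paper.

```python
# Sketch of the reported decoder training configuration.
# TrajectoryDecoder, its layers, and the (x, y, heading, speed) input layout
# are assumptions for illustration; the hyperparameters below follow the
# paper's stated setup (D_V = D_I = 256, AdamW, lr = 3e-4, |T| = 5).
import torch
import torch.nn as nn

D_V = 256           # vehicle feature dimension
D_I = 256           # interaction feature dimension
SAMPLE_STRIDE = 10  # 1-second interval at 10 timesteps per second
NUM_TIMESTEPS = 5   # |T| = 5 sampled timesteps per trajectory


def subsample_trajectory(traj: torch.Tensor) -> torch.Tensor:
    """Keep every SAMPLE_STRIDE-th step of a (T_full, 2) trajectory, up to |T| = 5."""
    return traj[::SAMPLE_STRIDE][:NUM_TIMESTEPS]


class TrajectoryDecoder(nn.Module):
    """Hypothetical stand-in for the code-to-trajectory decoder."""

    def __init__(self, d_vehicle: int = D_V, d_interaction: int = D_I):
        super().__init__()
        self.vehicle_proj = nn.Linear(4, d_vehicle)               # per-vehicle state features
        self.interaction_proj = nn.Linear(d_vehicle, d_interaction)
        self.head = nn.Linear(d_interaction, NUM_TIMESTEPS * 2)   # (x, y) per sampled step

    def forward(self, vehicle_states: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.vehicle_proj(vehicle_states))
        h = torch.relu(self.interaction_proj(h))
        return self.head(h).view(-1, NUM_TIMESTEPS, 2)


model = TrajectoryDecoder()
# AdamW with the initial learning rate reported in the paper.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
```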