Learning Cooperative Trajectory Representations for Motion Forecasting
Authors: Hongzhi Ruan, Haibao Yu, Wenxian Yang, Siqi Fan, Zaiqing Nie
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | V2X-Graph is evaluated on V2X-Seq in vehicle-to-infrastructure (V2I) scenarios. To further evaluate on vehicle-to-everything (V2X) scenarios, we construct the first real-world V2X motion forecasting dataset, V2X-Traj, which contains multiple autonomous vehicles and infrastructure in every scenario. Experimental results on both V2X-Seq and V2X-Traj show the advantage of our method. |
| Researcher Affiliation | Academia | Hongzhi Ruan (1,2), Haibao Yu (1,3), Wenxian Yang (1), Siqi Fan (1), Zaiqing Nie (1); 1: Institute for AI Industry Research (AIR), Tsinghua University; 2: University of Chinese Academy of Sciences; 3: The University of Hong Kong |
| Pseudocode | Yes | Algorithm 1: Pseudo Labels Generator. Input: ego-view trajectories T_ego, other-view trajectories T_other. Output: cross-view trajectory matching pseudo labels A. (A hedged matching sketch follows the table.) |
| Open Source Code | Yes | Find the project at https://github.com/AIR-THU/V2X-Graph. |
| Open Datasets | Yes | (1) V2X-Seq [51]. A public large-scale and real-world V2I dataset. (2) V2X-Traj (Ours). To study the effectiveness of V2X-Graph in V2V and broader V2X scenarios, especially its ability to handle more than two views of trajectories, including both V2I and V2V cooperation, we construct the first real-world and public V2X cooperative motion forecasting dataset, termed V2X-Traj. |
| Dataset Splits | Yes | The V2X-Traj dataset contains a total of 10,102 scenarios, which are randomly split into the training, validation, and test sets, consisting of 6,062, 2,020, and 2,020 scenarios, respectively. (A split sketch follows the table.) |
| Hardware Specification | Yes | The model is trained for 64 epochs with a batch size of 64 on a server with 8 NVIDIA RTX 4090s. Then we conduct the inference experiment on a single NVIDIA RTX 4090 and compare the inference cost. |
| Software Dependencies | No | The paper mentions using the 'AdamW optimizer' and 'Transformer' modules, and re-implementing 'HiVT', 'DenseTNT', and 'HDGT', but it does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or other key dependencies. |
| Experiment Setup | Yes | For training, the initial learning rate is set to 1 × 10^−3 and is scheduled according to cosine annealing [27]. The AdamW optimizer [28] is adopted with a weight decay of 1 × 10^−4. The model is trained for 64 epochs with a batch size of 64 on a server with 8 NVIDIA RTX 4090s. The dimension of the hidden features is set to 128, and the number of heads in all multi-head attention blocks is 16. (A PyTorch sketch of this setup follows the table.) |
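
The pseudocode row above describes Algorithm 1, which associates ego-view trajectories with trajectories observed from other views to produce cross-view matching pseudo labels A. The excerpt does not spell out the matching criterion, so the sketch below is a minimal illustration only: it assumes time-aligned (T, 2) position arrays and uses Hungarian assignment on mean pairwise distance with a hypothetical gating threshold, which may differ from the authors' actual procedure.

```python
# Hypothetical sketch of a cross-view trajectory matching pseudo-label
# generator, in the spirit of Algorithm 1. Trajectories are assumed to be
# (T, 2) arrays of x/y positions at synchronized timestamps; the
# distance-based Hungarian matching and MATCH_THRESHOLD are illustrative
# assumptions, not the authors' exact procedure.
import numpy as np
from scipy.optimize import linear_sum_assignment

MATCH_THRESHOLD = 2.0  # metres; hypothetical gating distance


def mean_pairwise_distance(traj_a: np.ndarray, traj_b: np.ndarray) -> float:
    """Average Euclidean distance between two time-aligned trajectories."""
    t = min(len(traj_a), len(traj_b))
    return float(np.linalg.norm(traj_a[:t] - traj_b[:t], axis=1).mean())


def generate_pseudo_labels(ego_trajs, other_trajs):
    """Return (ego_idx, other_idx) pairs as cross-view matching pseudo labels."""
    cost = np.array([
        [mean_pairwise_distance(te, to) for to in other_trajs]
        for te in ego_trajs
    ])
    rows, cols = linear_sum_assignment(cost)  # Hungarian assignment
    # Keep only assignments whose distance is below the gating threshold.
    return [(int(r), int(c)) for r, c in zip(rows, cols)
            if cost[r, c] < MATCH_THRESHOLD]
```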
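
For the dataset split row, a seeded random split reproduces the reported 6,062 / 2,020 / 2,020 partition sizes of the 10,102 V2X-Traj scenarios. The scenario IDs and the seed below are placeholders; the split released with V2X-Traj should be used for comparable results.

```python
# Illustrative seeded random split into 6,062 / 2,020 / 2,020 scenarios.
# The scenario IDs and the seed are placeholders, not the released split.
import random

scenario_ids = list(range(10_102))  # placeholder scenario IDs
random.Random(0).shuffle(scenario_ids)

train_ids = scenario_ids[:6_062]
val_ids = scenario_ids[6_062:6_062 + 2_020]
test_ids = scenario_ids[6_062 + 2_020:]

assert len(train_ids) == 6_062 and len(val_ids) == 2_020 and len(test_ids) == 2_020
```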
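
The experiment setup row maps onto a standard PyTorch optimization loop: AdamW with an initial learning rate of 1e-3, weight decay 1e-4, cosine annealing over 64 epochs, batch size 64, hidden dimension 128, and 16 attention heads. The sketch below assumes PyTorch; TinyForecastModel and the random tensors are placeholders so the snippet runs end to end, not the actual V2X-Graph network or data.

```python
# Hedged sketch of the quoted training configuration in PyTorch.
# TinyForecastModel and the dummy tensors are placeholders, not V2X-Graph.
import torch
from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR
from torch.utils.data import DataLoader, TensorDataset

EPOCHS = 64
BATCH_SIZE = 64
HIDDEN_DIM = 128   # hidden feature dimension from the paper
NUM_HEADS = 16     # heads in every multi-head attention block


class TinyForecastModel(nn.Module):
    """Placeholder network: one multi-head attention block plus a regression head."""

    def __init__(self, hidden_dim: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.head = nn.Linear(hidden_dim, 2)  # predict (x, y) offsets

    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return self.head(out)


# Dummy data: 50 time steps of 128-d agent features and 2-d targets.
features = torch.randn(1024, 50, HIDDEN_DIM)
targets = torch.randn(1024, 50, 2)
train_loader = DataLoader(TensorDataset(features, targets),
                          batch_size=BATCH_SIZE, shuffle=True)

model = TinyForecastModel(HIDDEN_DIM, NUM_HEADS)
optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS)
criterion = nn.SmoothL1Loss()

for epoch in range(EPOCHS):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # cosine annealing of the learning rate over 64 epochs
```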