Real-Time Motion Prediction via Heterogeneous Polyline Transformer with Relative Pose Encoding
Authors: Zhejun Zhang, Alexander Liniger, Christos Sakaridis, Fisher Yu, Luc Van Gool
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on Waymo and Argoverse-2 datasets show that HPTR achieves superior performance among end-to-end methods that do not apply expensive post-processing or model ensembling. |
| Researcher Affiliation | Collaboration | Zhejun Zhang (Computer Vision Lab, ETH Zurich, Switzerland; zhejun.zhang@vision.ee.ethz.ch); Alexander Liniger (Computer Vision Lab, ETH Zurich, Switzerland; alex.liniger@vision.ee.ethz.ch); Christos Sakaridis (Computer Vision Lab, ETH Zurich, Switzerland; csakarid@vision.ee.ethz.ch); Fisher Yu (Computer Vision Lab, ETH Zurich, Switzerland; i@yf.io); Luc Van Gool (CVL, ETH Zurich, CH; PSI, KU Leuven, BE; INSAIT, Un. Sofia, BU; vangool@vision.ee.ethz.ch). This work is funded by Toyota Motor Europe via the research project TRACE-Zürich. |
| Pseudocode | No | Figure 7 in the appendix illustrates the implementation of KNARPE with matrix operations, but it is a diagram, not pseudocode in a structured, step-by-step textual format. |
| Open Source Code | Yes | The code is available at https://github.com/zhejz/HPTR. |
| Open Datasets | Yes | We benchmark our method on the two most popular datasets: the Waymo Open Motion Dataset (WOMD) [16] and the Argoverse-2 motion forecasting dataset (AV2) [60]. |
| Dataset Splits | Yes | For WOMD, we randomly sample 25% from all training episodes at each epoch; for AV2 we use 50%. |
| Hardware Specification | Yes | We train with a total batch size of 12 episodes on 4 RTX 2080Ti GPUs. |
| Software Dependencies | No | We use standard Ubuntu, Python and Pytorch without optimizing for real-time deployment. (No version numbers provided for these software dependencies). |
| Experiment Setup | Yes | We use the AdamW optimizer with an initial learning rate of 1e-4, decayed by a factor of 0.5 every 25 epochs. We train with a total batch size of 12 episodes... Our final models are trained for 120 epochs for WOMD and 150 epochs for AV2. The base number of neighbors considered by KNARPE is K = 36. This number is multiplied by γTL = 2, γAG = 4 and γAC = 10 respectively for the enhance-TL, enhance-AG and AC-to-all Transformer. We set NAC = 6 to predict exactly 6 futures as specified by the leaderboard. |
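For illustration, the reported optimizer and schedule can be sketched in PyTorch as follows. This is a minimal, hypothetical reconstruction from the hyperparameters quoted above (the `nn.Linear` model is a placeholder, not the HPTR network, and the training loop body is elided):

```python
import torch

# Placeholder model standing in for HPTR (assumption for illustration only).
model = torch.nn.Linear(16, 6)

# Reported setup: AdamW, initial lr 1e-4, decayed by 0.5 every 25 epochs.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=25, gamma=0.5)

# Neighbor counts for KNARPE: base K = 36, scaled per Transformer module.
K_BASE = 36
K_TL = K_BASE * 2   # enhance-TL  (gamma_TL = 2)
K_AG = K_BASE * 4   # enhance-AG  (gamma_AG = 4)
K_AC = K_BASE * 10  # AC-to-all   (gamma_AC = 10)

N_AC = 6  # number of predicted futures, as specified by the leaderboard

for epoch in range(120):  # 120 epochs for WOMD (150 for AV2)
    # ... one epoch over the sampled training episodes would go here ...
    scheduler.step()
```

After 120 epochs the learning rate has halved four times (at epochs 25, 50, 75, and 100), ending at 1e-4 × 0.5⁴ = 6.25e-6.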