SEPT: Towards Efficient Scene Representation Learning for Motion Prediction
Authors: Zhiqian Lan, Yuxuan Jiang, Yao Mu, Chen Chen, Shengbo Eben Li
Venue: ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that SEPT, without elaborate architectural design or manual feature engineering, achieves state-of-the-art performance on the Argoverse 1 and Argoverse 2 motion forecasting benchmarks, outperforming previous methods on all main metrics by a large margin. |
| Researcher Affiliation | Academia | School of Vehicle and Mobility, Tsinghua University; {lanzq21, jyx21}@mails.tsinghua.edu.cn; lishbo@tsinghua.edu.cn |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | The effectiveness of our approach is verified on Argoverse 1 and Argoverse 2, two widely-used large-scale motion forecasting datasets collected from real world. |
| Dataset Splits | Yes | In the downstream motion prediction training stage, we train and validate following the split of the Argoverse dataset. ... In the scene understanding training stage, we concatenate train, validation and test dataset as the pretrain dataset with labels dropped. |
| Hardware Specification | Yes | Both stages are trained with a batch size of 96 on a single NVIDIA GeForce RTX 3090 Ti GPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA versions) used for the experiments. |
| Experiment Setup | Yes | The model is trained for 150 epochs with a constant learning rate of 2×10⁻⁴. ... The model is trained for 50 epochs with the learning rate decayed linearly from 2×10⁻⁴ to 0. Both stages are trained with a batch size of 96... In our main experiment, we simply use p_MTM = 0.5, p_MRM = 0.5 and Th = 8/20 (for Argoverse 1/2) without tuning. Table 6 reports the hyperparameters for the SEPT network architecture. (See the configuration sketch below the table.) |
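
The experiment-setup row packs several hyperparameters into a single quote; the sketch below restates them as plain configuration objects for easier scanning. This is a hedged illustration only: the class and field names (`PretrainStageConfig`, `FinetuneStageConfig`, `p_mtm`, `p_mrm`, `tail_length`) are assumptions and do not come from any released SEPT code, while the numeric values (epochs, learning rates, batch size, masking probabilities, Th) are the ones quoted above.

```python
# Hedged sketch of the two-stage training setup quoted in the table.
# Names are illustrative assumptions; values are the reported hyperparameters.
from dataclasses import dataclass


@dataclass
class PretrainStageConfig:
    """Scene understanding (pretraining) stage."""
    epochs: int = 150
    learning_rate: float = 2e-4   # held constant throughout pretraining
    batch_size: int = 96
    p_mtm: float = 0.5            # masking probability for the MTM pretext task
    p_mrm: float = 0.5            # masking probability for the MRM pretext task
    tail_length: int = 8          # Th: 8 for Argoverse 1, 20 for Argoverse 2


@dataclass
class FinetuneStageConfig:
    """Downstream motion prediction (fine-tuning) stage."""
    epochs: int = 50
    learning_rate: float = 2e-4   # decayed linearly to 0 over training
    batch_size: int = 96


if __name__ == "__main__":
    pretrain_av1 = PretrainStageConfig()                  # Argoverse 1 defaults
    pretrain_av2 = PretrainStageConfig(tail_length=20)    # Argoverse 2 variant
    finetune = FinetuneStageConfig()
    print(pretrain_av1, pretrain_av2, finetune)
```

Splitting the setup into two objects mirrors the paper's description of a pretraining stage followed by a fine-tuning stage; per the hardware row, both stages reportedly fit on a single NVIDIA GeForce RTX 3090 Ti at batch size 96.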