Social-Transmotion: Promptable Human Trajectory Prediction
Authors: Saeed Saadatnejad, Yang Gao, Kaouther Messaoud, Alexandre Alahi
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY. Our experimental results demonstrate that Social-Transmotion outperforms previous models on several datasets. |
| Researcher Affiliation | Academia | Saeed Saadatnejad, Yang Gao, Kaouther Messaoud, Alexandre Alahi; Visual Intelligence for Transportation (VITA) laboratory, EPFL, Switzerland; {firstname.lastname}@epfl.ch |
| Pseudocode | No | The paper includes architectural diagrams (Figure 1, Figure 2) but does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is publicly available: https://github.com/vita-epfl/social-transmotion. |
| Open Datasets | Yes | We evaluate on three publicly available datasets providing visual cues: the JTA (Fabbri et al., 2018) and the JRDB (Martin-Martin et al., 2021) in the main text, and the Pedestrians and Cyclists in Road Traffic (Kress et al., 2022) in Appendix A.1. Furthermore, we report on the ETH-UCY dataset (Pellegrini et al., 2009; Lerner et al., 2007), that does not contain visual cues in Appendix A.2. |
| Dataset Splits | Yes | JTA dataset: a large-scale synthetic dataset containing 256 training sequences, 128 validation sequences, and 128 test sequences, with a total of approximately 10 million 3D keypoint annotations. For the JRDB dataset, after extracting data, we used the TrajNet++ (Kothari et al., 2021) code base to generate four types of trajectories with acceptance rates of 1.0, 1.0, 1.0, 1.0. We used gates-ai-lab-2019-02-08_0 for validation, the indoor video packard-poster-session-2019-03-20_1 and the outdoor video bytes-cafe-2019-02-07_0 for testing, and gates-basement-elevators-2019-01-17_1, hewlett-packard-intersection-2019-01-24_0, huang-lane-2019-02-12_0, jordan-hall-2019-04-22_0, packard-poster-session-2019-03-20_2, stlc-111-2019-04-19_0, svl-meeting-gates-2-2019-04-08_0, svl-meeting-gates-2-2019-04-08_1, and tressider-2019-03-16_1 for training. |
| Hardware Specification | Yes | All computations were performed on a NVIDIA V100 GPU equipped with 32GB of memory. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and the TrajNet++ code base, but does not provide specific version numbers for these or other software dependencies, such as the programming language (e.g., Python) or deep learning framework (e.g., PyTorch, TensorFlow). |
| Experiment Setup | Yes | The architecture of CMT includes six layers and four heads, whereas ST is constructed with three layers and four heads; both utilize a model dimension of 128. We employed the Adam optimizer (Kingma & Ba, 2014) with an initial learning rate of 1 × 10⁻⁴, which was reduced by a factor of 0.1 after 80% of the 50 total epochs were completed. We had 30% modality-masking and 10% meta-masking. |
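The Experiment Setup row pins down the transformer sizes and the optimization schedule. Below is a minimal PyTorch sketch of that configuration; the paper does not name its framework or expose its module structure, so `make_encoder`, `cmt`, and `st` are illustrative stand-ins rather than the authors' actual code.

```python
from torch import nn, optim

D_MODEL, N_HEADS, EPOCHS = 128, 4, 50

def make_encoder(num_layers: int) -> nn.TransformerEncoder:
    # Both transformers use four heads and a model dimension of 128.
    layer = nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=N_HEADS)
    return nn.TransformerEncoder(layer, num_layers=num_layers)

cmt = make_encoder(6)  # cross-modality transformer: six layers
st = make_encoder(3)   # social transformer: three layers

params = list(cmt.parameters()) + list(st.parameters())
optimizer = optim.Adam(params, lr=1e-4)
# Reduce the learning rate by a factor of 0.1 once 80% of the 50 epochs
# have elapsed, i.e. at epoch 40.
scheduler = optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[int(0.8 * EPOCHS)], gamma=0.1
)
```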
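The quoted masking rates (30% modality-masking, 10% meta-masking) can likewise be sketched. The snippet below is a hypothetical reading of those two rates, not the authors' implementation: `apply_masks`, its tensor layout, and the choice to zero out masked tokens are all assumptions.

```python
import torch

def apply_masks(tokens: torch.Tensor, modality_ids: torch.Tensor,
                p_modality: float = 0.3, p_meta: float = 0.1) -> torch.Tensor:
    """Mask input tokens during training (hypothetical sketch).

    tokens:       (num_tokens, d_model) embedded cues for one agent
    modality_ids: (num_tokens,) integer id of the cue each token encodes
    """
    masked = tokens.clone()
    # Modality-masking: with probability p_modality, drop every token
    # belonging to a given cue (e.g. all 3D-pose tokens).
    for m in modality_ids.unique():
        if torch.rand(()).item() < p_modality:
            masked[modality_ids == m] = 0.0
    # Meta-masking: independently drop a fraction p_meta of the tokens.
    drop = torch.rand(tokens.shape[0]) < p_meta
    masked[drop] = 0.0
    return masked
```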