M3SOT: Multi-Frame, Multi-Field, Multi-Space 3D Single Object Tracking

Authors: Jiaming Liu, Yue Wu, Maoguo Gong, Qiguang Miao, Wenping Ma, Cai Xu, Can Qin

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on benchmarks such as KITTI, nuScenes, and Waymo Open Dataset demonstrate that M3SOT achieves state-of-the-art performance at 38 FPS."
Researcher Affiliation | Academia | "Jiaming Liu¹, Yue Wu¹*, Maoguo Gong¹, Qiguang Miao¹, Wenping Ma¹, Cai Xu¹, Can Qin² — ¹Xidian University, China; ²Northeastern University, USA. {ljm@stu., ywu@, qgmiao@, wpma@mail., cxu@}xidian.edu.cn, gong@ieee.org, qin.ca@northeastern.edu"
Pseudocode | No | "No structured pseudocode or algorithm blocks were found."
Open Source Code | Yes | "Our code and models are available at https://github.com/ywu0912/TeamCode.git."
Open Datasets | Yes | "We compare the proposed M3SOT with state-of-the-art methods on three large datasets: KITTI (Geiger, Lenz, and Urtasun 2012), nuScenes (Caesar et al. 2020), and Waymo Open Dataset (WOD) (Sun et al. 2020)."
Dataset Splits | Yes | "For KITTI, we divide the training sequence into three parts, 0-16 for training, 17-18 for validation, and 19-20 for testing. For the more challenging nuScenes, we use its validation split to evaluate our model, which contains 150 scenarios." (A sketch of this split follows the table.)
Hardware Specification | Yes | "Extensive experiments show that M3SOT achieves state-of-the-art performance on three benchmarks while running at 38 FPS on a single NVIDIA RTX 3090 GPU."
Software Dependencies | No | "No specific software dependencies with version numbers were explicitly provided. The paper mentions 'DGCNN' and 'X-RPN' as components and 'MindSpore, CANN and Ascend AI Processor' in acknowledgments, but without version details."
Experiment Setup | Yes | "Implementation Details. We dilate the ground truth BBox by 2 meters to track possible objects in the area. DGCNN (Wang et al. 2019) with different configurations is used as the feature extractor, and X-RPN (Xu et al. 2023a) with the same parameters is used as the localization head." (A sketch of the search-area dilation follows the table.)
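
A minimal sketch of the KITTI scene-level split quoted in the "Dataset Splits" row; the dictionary layout and helper name are illustrative assumptions, not the authors' code:

```python
# KITTI tracking sequences partitioned as stated in the paper:
# 0-16 for training, 17-18 for validation, 19-20 for testing.
KITTI_SPLITS = {
    "train": list(range(0, 17)),  # sequences 0000-0016
    "val": list(range(17, 19)),   # sequences 0017-0018
    "test": list(range(19, 21)),  # sequences 0019-0020
}

def sequences_for(split: str) -> list:
    """Return the KITTI tracking sequence IDs belonging to a split."""
    return KITTI_SPLITS[split]

assert sequences_for("val") == [17, 18]
```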
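
The 2-meter BBox dilation in the "Experiment Setup" row defines the search area around the previous target. Below is a hedged sketch of that cropping step, assuming an axis-aligned box and a per-face offset; the function name, array shapes, and the axis-aligned simplification (the paper's boxes are oriented) are all assumptions for illustration:

```python
import numpy as np

def crop_search_area(points: np.ndarray, center: np.ndarray,
                     size: np.ndarray, offset: float = 2.0) -> np.ndarray:
    """Keep the points inside the ground-truth BBox dilated by `offset`
    meters on each face (axis-aligned approximation).
    points: (N, 3) LiDAR points; center: (3,); size: (3,) box extents."""
    half = size / 2.0 + offset  # dilate every face by 2 m
    mask = np.all(np.abs(points - center) <= half, axis=1)
    return points[mask]
```

Cropping with the dilated box rather than the tight ground-truth box keeps the target inside the search region even when it moves between frames, which is why an explicit offset is used.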