Robust 3D Tracking with Quality-Aware Shape Completion

Authors: Jingwen Zhang, Zikun Zhou, Guangming Lu, Jiandong Tian, Wenjie Pei

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Favorable performance against state-of-the-art algorithms on three benchmarks demonstrates the effectiveness and generalization ability of our method. ... We evaluate our algorithm on KITTI (Geiger, Lenz, and Urtasun 2012), nuScenes (Caesar et al. 2020), and Waymo Open Dataset (WOD) (Sun et al. 2020).
Researcher Affiliation | Academia | Harbin Institute of Technology, Shenzhen; Peng Cheng Laboratory; Shenyang Institute of Automation, Chinese Academy of Sciences
Pseudocode | No | The paper describes its methods through text and diagrams (Figures 2, 3, and 4) but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets | Yes | We evaluate our algorithm on KITTI (Geiger, Lenz, and Urtasun 2012), nuScenes (Caesar et al. 2020), and Waymo Open Dataset (WOD) (Sun et al. 2020).
Dataset Splits | Yes | KITTI consists of 21 training and 29 test sequences. We split the training set into train/validation/test splits as the test labels are inaccessible, following (Giancola, Zarzar, and Ghanem 2019; Zheng et al. 2022). nuScenes comprises 1000 scenes, which are divided into train/validation/test sets. ... WOD contains 1150 scenes, of which 798/202/150 scenes are used for training/validation/testing, respectively. (A split sketch follows the table.)
Hardware Specification | Yes | We measure the average tracking speed on the Car category of KITTI on an RTX 3090 GPU, which is about 31 FPS. (A timing sketch follows the table.)
Software Dependencies | No | The paper mentions using 'a modified PointNet++ (Qi et al. 2017b) as our backbone' but does not specify version numbers for any software dependencies or libraries.
Experiment Setup | Yes | We use a modified PointNet++ (Qi et al. 2017b) as our backbone, which is tailored to contain three set-abstraction (SA) layers and three feature-propagation (FP) layers. In the three SA layers, the sample radii are set to 0.3, 0.5, and 0.7, and the points are randomly sampled to 512, 256, and 128 points, respectively. Similar to (Zheng et al. 2022), we enlarge the target box predicted in the previous frame by 2 meters to obtain the search area in the current frame. We utilize the targetness prediction operation (Zheng et al. 2022) as a pre-processing step in our tracking framework. At the beginning of tracking, we use the target points lying inside the given box to initialize T. ... We impose a smooth-L1 loss (Girshick 2015) on both the coarse and refined boxes to supervise the learning of the tracking model. (A configuration sketch follows the table.)
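
The paper only states that the KITTI training sequences are re-split following Giancola, Zarzar, and Ghanem (2019) and Zheng et al. (2022). The sketch below spells out the sequence indices of that commonly used convention; the exact indices and all names are illustrative assumptions, not quotes from the paper.

```python
# Illustrative split definitions for the single-object tracking setup.
# The KITTI sequence indices follow the convention popularized by
# Giancola, Zarzar, and Ghanem (2019), which the paper says it adopts;
# they are an assumption for illustration, not stated in the paper.

KITTI_SPLITS = {
    "train": list(range(0, 17)),  # sequences 0000-0016
    "val":   [17, 18],            # sequences 0017-0018
    "test":  [19, 20],            # sequences 0019-0020
}

# nuScenes and WOD are split at the scene level; the paper reports 1000 and
# 1150 scenes respectively, with WOD using 798/202/150 scenes for
# training/validation/testing.
WOD_SPLIT_SIZES = {"train": 798, "val": 202, "test": 150}

def kitti_sequences_for(split):
    """Return the KITTI tracking sequence ids used for a given split."""
    return KITTI_SPLITS[split]

if __name__ == "__main__":
    print("KITTI test sequences:", kitti_sequences_for("test"))
```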
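The paper reports roughly 31 FPS on an RTX 3090 but does not describe the timing protocol. Below is a minimal sketch of how average per-frame speed is commonly measured on a CUDA GPU; `track_one_frame` is a hypothetical callable standing in for the full per-frame tracking step.

```python
import time
import torch

def measure_fps(track_one_frame, frames):
    """Average tracking speed (frames per second) over a list of frame inputs.

    `track_one_frame` is a hypothetical callable wrapping one full tracking
    step (search-area cropping, network forward pass, box refinement).
    """
    # Warm up so one-off CUDA initialization does not skew the measurement.
    for frame in frames[:5]:
        track_one_frame(frame)
    torch.cuda.synchronize()

    start = time.perf_counter()
    for frame in frames:
        track_one_frame(frame)
    torch.cuda.synchronize()  # wait for queued GPU kernels to finish
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed
```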
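The experiment-setup description maps naturally onto a small configuration. The PyTorch-flavored sketch below collects the reported hyper-parameters (SA radii and point counts, the 2-meter search-area enlargement, smooth-L1 supervision on the coarse and refined boxes). The function names and box-size representation are illustrative assumptions, and the SA/FP modules themselves are assumed to come from an existing PointNet++ implementation rather than being defined here.

```python
import torch
import torch.nn as nn

# Hyper-parameters reported in the paper for the modified PointNet++ backbone:
# three set-abstraction (SA) layers with sample radii 0.3/0.5/0.7 and
# 512/256/128 randomly sampled points.
SA_CONFIG = [
    {"radius": 0.3, "n_points": 512},
    {"radius": 0.5, "n_points": 256},
    {"radius": 0.7, "n_points": 128},
]

SEARCH_AREA_ENLARGE_M = 2.0  # enlarge the previous-frame box by 2 meters

def enlarge_box(box_size: torch.Tensor, offset: float = SEARCH_AREA_ENLARGE_M) -> torch.Tensor:
    """Grow a (w, l, h) box size to define the current-frame search area (sketch)."""
    return box_size + offset

# Smooth-L1 supervision imposed on both the coarse and refined box predictions.
box_criterion = nn.SmoothL1Loss()

def box_regression_loss(coarse_pred, refined_pred, gt_box):
    """Sum of smooth-L1 losses on the coarse and refined boxes (illustrative)."""
    return box_criterion(coarse_pred, gt_box) + box_criterion(refined_pred, gt_box)
```

A fixed 2-meter enlargement of the previous-frame box, as described in the paper, keeps the search region compact enough for real-time inference while still tolerating inter-frame target motion.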