Behind the Curtain: Learning Occluded Shapes for 3D Object Detection

Authors: Qiangeng Xu, Yiqi Zhong, Ulrich Neumann
Pages: 2893-2901

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on the KITTI Dataset and the Waymo Open Dataset demonstrate the effectiveness of BtcDet.
Researcher Affiliation | Academia | University of Southern California; qiangenx@usc.edu, yiqizhon@usc.edu, uneumann@usc.edu
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | Code is released.
Open Datasets | Yes | Extensive experiments on the KITTI Dataset and the Waymo Open Dataset demonstrate the effectiveness of BtcDet.
Dataset Splits | Yes | The KITTI Dataset includes 7481 LiDAR frames for training and 7518 LiDAR frames for testing. We follow (Chen et al. 2017) to divide the training data into a train split of 3712 frames and a val split of 3769 frames. The Waymo Open Dataset (WOD) consists of 798 segments of 158361 LiDAR frames for training and 202 segments of 40077 LiDAR frames for validation.
Hardware Specification | Yes | In all of our experiments, we train our models with a batch size of 8 on 4 GTX 1080 Ti GPUs.
Software Dependencies | No | The paper states that "The BtcDet is end-to-end optimized by the ADAM optimizer (Kingma and Ba 2014) from scratch." It names no software libraries and gives no version numbers.
Experiment Setup | Yes | Determined by grid search, we set γ = 2 in Eq. 6, δ = 0.2 in Eq. 7 and µ = 1.05 in Eq. 10. In all of our experiments, we train our models with a batch size of 8 on 4 GTX 1080 Ti GPUs. On the KITTI Dataset, we train BtcDet for 40 epochs, while on the WOD, we train BtcDet for 30 epochs. The BtcDet is end-to-end optimized by the ADAM optimizer (Kingma and Ba 2014) from scratch. We apply the widely adopted data augmentations (Shi et al. 2020; Deng et al. 2020; Lang et al. 2019; Yang et al. 2020; Ye et al. 2020), which include flipping, scaling, rotation and the ground-truth augmentation.
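
The reported splits and training settings can be collected in one place and sanity-checked. The following is a minimal sketch, not the released code. The names `TrainSetup`, `steps_per_epoch`, the field names and the augmentation labels are hypothetical. It also assumes that the batch size of 8 is the total across the 4 GPUs and that KITTI training uses only the 3712-frame train split; the summary above confirms neither.

```python
import math
from dataclasses import dataclass

# Frame and segment counts as reported in the "Dataset Splits" row.
KITTI_TRAINVAL_FRAMES = 7481
KITTI_TRAIN_FRAMES = 3712
KITTI_VAL_FRAMES = 3769
KITTI_TEST_FRAMES = 7518
WOD_TRAIN = {"segments": 798, "frames": 158361}
WOD_VAL = {"segments": 202, "frames": 40077}


@dataclass(frozen=True)
class TrainSetup:
    """Hypothetical container; field names are not from the released code."""
    dataset: str
    epochs: int
    batch_size: int = 8          # ASSUMPTION: total across all GPUs
    num_gpus: int = 4            # GTX 1080 Ti
    optimizer: str = "adam"      # trained from scratch
    gamma: float = 2.0           # Eq. 6
    delta: float = 0.2           # Eq. 7
    mu: float = 1.05             # Eq. 10
    # Labels are illustrative names for the four reported augmentations.
    augmentations: tuple = ("flip", "scale", "rotate", "gt_augmentation")


def steps_per_epoch(num_frames: int, batch_size: int) -> int:
    """Optimizer steps per epoch, keeping a final partial batch."""
    return math.ceil(num_frames / batch_size)


KITTI_SETUP = TrainSetup(dataset="kitti", epochs=40)
WOD_SETUP = TrainSetup(dataset="waymo_open", epochs=30)

# The train/val split partitions the 7481 KITTI training frames exactly.
assert KITTI_TRAIN_FRAMES + KITTI_VAL_FRAMES == KITTI_TRAINVAL_FRAMES

# ASSUMPTION: training iterates over the 3712-frame train split only.
kitti_steps = steps_per_epoch(KITTI_TRAIN_FRAMES, KITTI_SETUP.batch_size)
print(kitti_steps, kitti_steps * KITTI_SETUP.epochs)
```

Under these assumptions the script prints `464 18560`, that is, 464 optimizer steps per KITTI epoch and 18560 steps over 40 epochs. If the batch size of 8 were per GPU instead, both figures would shrink by a factor of 4.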