Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection

Authors: Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, Houqiang Li
Pages: 1201-1209

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted on the widely used KITTI Dataset and the more recent Waymo Open Dataset. Our results show that compared to existing voxel-based methods, Voxel R-CNN delivers a higher detection accuracy while maintaining a real-time frame processing rate, i.e., at a speed of 25 FPS on an NVIDIA RTX 2080 Ti GPU.
Researcher Affiliation | Academia | Jiajun Deng (1), Shaoshuai Shi (2), Peiwei Li (1), Wengang Zhou (1, 3), Yanyong Zhang (4), Houqiang Li (1, 3). (1) CAS Key Laboratory of GIPAS, EEIS Department, University of Science and Technology of China; (2) Multimedia Laboratory, The Chinese University of Hong Kong; (3) Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; (4) Department of Computer Science, University of Science and Technology of China
Pseudocode | No | The paper includes figures illustrating the architecture and concepts (e.g., Figures 1, 2, 3, 4) and describes operations in text, but it does not contain any formally structured pseudocode blocks or sections explicitly labeled "Algorithm".
Open Source Code | Yes | The code is available at https://github.com/djiajunustc/Voxel-R-CNN.
Open Datasets | Yes | Extensive experiments are conducted on the widely used KITTI Dataset and the more recent Waymo Open Dataset.
Dataset Splits | Yes | As a common practice, the training data are divided into a train set with 3712 samples and a val set with 3769 samples.
Hardware Specification | Yes | Our results show that compared to existing voxel-based methods, Voxel R-CNN delivers a higher detection accuracy while maintaining a real-time frame processing rate, i.e., at a speed of 25 FPS on an NVIDIA RTX 2080 Ti GPU.
Software Dependencies | No | The paper mentions conducting experiments with the OpenPCDet toolbox ("Please refer to OpenPCDet for more detailed configurations since we conduct all experiments with this toolbox.") and provides a link to it, but it does not list specific version numbers for OpenPCDet itself or any other software dependencies such as Python, PyTorch/TensorFlow, or CUDA.
Experiment Setup | Yes | For KITTI Dataset, the network is trained for 80 epochs with the batch size 16. For Waymo Open Dataset, the network is trained for 30 epochs with the batch size 32. The learning rate is initialized as 0.01 for both datasets and updated by cosine annealing strategy. In the detect head, the foreground IoU threshold θ_H is set as 0.75, background IoU threshold θ_L is set as 0.25, and the box regression IoU threshold θ_reg is set as 0.55. We randomly sample 128 RoIs as the training samples of detect head.
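
The Experiment Setup values can be made concrete with a short Python sketch. It is illustrative only and is not taken from the authors' code or from OpenPCDet. The function names are hypothetical, the schedule is plain cosine annealing with an assumed minimum learning rate of 0 (the toolbox's actual schedule may add warm-up or other phases), and the soft confidence target that interpolates linearly between the two IoU thresholds is an assumption about how the foreground and background thresholds are used.

```python
import math

# Values quoted in the Experiment Setup row.
INITIAL_LR = 0.01
KITTI_EPOCHS = 80
THETA_H = 0.75    # foreground IoU threshold
THETA_L = 0.25    # background IoU threshold
THETA_REG = 0.55  # box regression IoU threshold


def cosine_annealed_lr(epoch, total_epochs=KITTI_EPOCHS,
                       initial_lr=INITIAL_LR, min_lr=0.0):
    """Plain cosine annealing from initial_lr down to min_lr.

    min_lr=0.0 is an assumption; the summary does not state a floor.
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (initial_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def confidence_target(iou):
    """Assumed soft label: 0 below THETA_L, 1 above THETA_H, linear in between."""
    if iou < THETA_L:
        return 0.0
    if iou > THETA_H:
        return 1.0
    return (iou - THETA_L) / (THETA_H - THETA_L)


def contributes_to_regression(iou):
    """Assumed rule: only RoIs with IoU >= THETA_REG feed the box regression loss."""
    return iou >= THETA_REG


if __name__ == "__main__":
    for epoch in (0, 40, 80):
        print(epoch, cosine_annealed_lr(epoch))
    for iou in (0.1, 0.5, 0.6, 0.9):
        print(iou, confidence_target(iou), contributes_to_regression(iou))
```

Under these assumptions, the learning rate starts at 0.01 and decays smoothly toward 0 over the 80 KITTI epochs, and an RoI with IoU 0.6 would receive a partial confidence target while still counting toward box regression.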