SwiftPillars: High-Efficiency Pillar Encoder for Lidar-Based 3D Detection
Authors: Xin Jin, Kai Liu, Cong Ma, Ruining Yang, Fei Hui, Wei Wu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, our proposal accomplishes 61.3% NDS and 53.2% mAP on the nuScenes dataset. In addition, we evaluate inference time on several platforms (P4, T4, A2, MLU370, RTX3080), where SwiftPillars achieves up to 13.3ms (75FPS) on NVIDIA Tesla T4. Compared with PointPillars, SwiftPillars is on average 26.58% faster in inference speed with equivalent GPUs and a higher mAP of approximately 3.2% on the nuScenes dataset. |
| Researcher Affiliation | Collaboration | Xin Jin1,2*, Kai Liu2*, Cong Ma2*, Ruining Yang1,2, Fei Hui1, Wei Wu2,3 — 1Chang'an University, 2SenseAuto Research, 3Tsinghua University. {jinxin, yangruining, feihui}@chd.edu.cn, {liukai3.iag, macong, wuwei}@senseauto.com |
| Pseudocode | No | The paper contains architectural diagrams but no explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states 'We conduct our experiments in the OpenPCDet framework (Team 2020).' but does not provide an explicit statement or link for the open-source code of SwiftPillars itself. |
| Open Datasets | Yes | We perform extensive experiments of the proposed SwiftPillars on three major benchmarks: the large-scale nuScenes dataset (Caesar et al. 2020), KITTI (Geiger, Lenz, and Urtasun 2012), and DAIR-V2X (Yu et al. 2022b). |
| Dataset Splits | Yes | nuScenes is a challenging large-scale dataset that provides 1000 scenes, of which 700 are used for training, 150 for validation, and 150 for testing. |
| Hardware Specification | Yes | In speed evaluation, all models are converted to ONNX on different GPUs (P4, T4, A2, MLU370, RTX3080) with TensorRT accelerating. [...] P4, A2, T4 and RTX3080 mean NVIDIA Tesla P4, NVIDIA A2, NVIDIA T4 and NVIDIA GeForce RTX 3080. |
| Software Dependencies | Yes | Utilizing TensorRT exec on the NVIDIA Tesla T4 platform with CUDA 11.4 and TensorRT 8.2.0, we conduct a comparative analysis of the inference speeds of SPE and PFN across different parameters. |
| Experiment Setup | Yes | The models are all trained using the Adam optimizer with a one-cycle learning rate strategy and an initial learning rate of 0.001 on 8 NVIDIA V100 GPUs. In speed evaluation, all models are converted to ONNX on different GPUs (P4, T4, A2, MLU370, RTX3080) with TensorRT accelerating. Their hyperparameters are set to be consistent with those of (Shi, Li, and Ma 2022), with a weight decay of 0.01 and a momentum of 0.9. Additionally, we employ standard data augmentation, including gt-sampling, random flip, random rotation, and random scaling. nuScenes Dataset: we voxelize the 3D space with a detection range of [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0] using a voxel size of [0.2, 0.2, 8], where the maximum number of points within each voxel is set to 32. The batch size is set to 32 to train 20 epochs, which takes about 15 hours. |
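As a sanity check on the experiment setup quoted above, the detection range and voxel (pillar) size determine the BEV grid the pillar encoder operates on. The sketch below is illustrative, not from the paper; the function name and variable names are assumptions, and only the range and voxel-size values come from the quoted setup.

```python
# Derive the BEV pillar grid implied by the quoted nuScenes config:
# detection range [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0] and
# voxel (pillar) size [0.2, 0.2, 8]. Names here are illustrative.

def grid_shape(point_cloud_range, voxel_size):
    """Number of voxels along x, y, z for a given range and voxel size."""
    x_min, y_min, z_min, x_max, y_max, z_max = point_cloud_range
    extents = (x_max - x_min, y_max - y_min, z_max - z_min)
    # round() guards against floating-point drift like 102.4 / 0.2 != 512.0
    return tuple(round(e / v) for e, v in zip(extents, voxel_size))

shape = grid_shape([-51.2, -51.2, -5.0, 51.2, 51.2, 3.0], [0.2, 0.2, 8])
print(shape)  # (512, 512, 1)
```

A single voxel spanning the full 8 m height (z from -5.0 to 3.0) is what makes these voxels "pillars": the encoder sees a 512x512 BEV grid with at most 32 points per pillar.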