Pseudo-LiDAR++: Accurate Depth for 3D Object Detection in Autonomous Driving

Authors: Yurong You, Yan Wang, Wei-Lun Chao, Divyansh Garg, Geoff Pleiss, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show on the KITTI object detection benchmark that our combined approach yields substantial improvements in depth estimation and stereo-based 3D object detection, outperforming the previous state-of-the-art detection accuracy for faraway objects by 40%. Our code is available at https://github.com/mileyan/Pseudo_Lidar_V2. We conduct extensive empirical studies of our approaches on the KITTI object detection benchmark (Geiger et al., 2012; 2013) and achieve remarkable results.
Researcher Affiliation | Academia | Yurong You¹, Yan Wang¹, Wei-Lun Chao², Divyansh Garg¹, Geoff Pleiss¹, Bharath Hariharan¹, Mark Campbell¹, and Kilian Q. Weinberger¹; ¹Cornell University, Ithaca, NY; ²The Ohio State University, Columbus, OH; {yy785, yw763, dg595, gp346, bh497, mc288, kqw4}@cornell.edu, chao.209@osu.edu
Pseudocode | Yes | Algorithm 1: Graph-based depth correction (GDC).
Open Source Code | Yes | Our code is available at https://github.com/mileyan/Pseudo_Lidar_V2.
Open Datasets | Yes | We evaluate on the KITTI dataset (Geiger et al., 2013; 2012), which contains 7,481 and 7,518 images for training and testing. We follow (Chen et al., 2015) to separate the 7,481 images into 3,712 for training and 3,769 validation.
Dataset Splits | Yes | We follow (Chen et al., 2015) to separate the 7,481 images into 3,712 for training and 3,769 validation.
Hardware Specification | No | With simple optimizations, GDC runs in 90 ms/frame using a single GPU (7.7 ms for KD-tree construction and search).
Software Dependencies | No | We use PSMNET (Chang & Chen, 2018) as the backbone for our stereo depth estimation network (SDN). We applied the grid_sample function in PyTorch for bilinear interpolation.
Experiment Setup | Yes | For GDC we set k = 10 and consider adding signal from a (simulated) 4-beam LiDAR, unless stated otherwise. We train PIXOR using RMSProp with momentum 0.9, learning rate 10^-5 (decay by 10 after 50 and 80 epochs) for 90 epochs.
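
The PIXOR learning-rate schedule quoted under Experiment Setup can be made concrete with a small sketch. It is an illustration rather than the authors' training code: the function name `learning_rate`, the constants, and the 0-based epoch indexing (the decay takes effect starting at epochs 50 and 80) are assumptions, since the quoted text does not specify them.

```python
# Hypothetical sketch of the quoted schedule: base learning rate 1e-5,
# divided by 10 after epochs 50 and 80, for 90 epochs in total.
# Assumption: epochs are 0-indexed and the decay applies from the milestone epoch on.

BASE_LR = 1e-5
MILESTONES = (50, 80)   # epochs after which the rate is decayed
DECAY_FACTOR = 10.0     # "decay by 10"
TOTAL_EPOCHS = 90


def learning_rate(epoch, base_lr=BASE_LR, milestones=MILESTONES, decay=DECAY_FACTOR):
    """Return the learning rate used during the given 0-indexed epoch."""
    # Count how many milestones have already been reached.
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr / (decay ** passed)


# Learning rate for every epoch of the 90-epoch run.
schedule = [learning_rate(e) for e in range(TOTAL_EPOCHS)]
```

Under these assumptions, epochs 0 to 49 use 10^-5, epochs 50 to 79 use 10^-6, and epochs 80 to 89 use 10^-7. In PyTorch a comparable step schedule is commonly expressed with `torch.optim.lr_scheduler.MultiStepLR` (milestones 50 and 80, gamma 0.1); the quoted text does not say whether the authors used it.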