DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation

Authors: Yueru Luo, Shuguang Cui, Zhen Li

ICLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on the public benchmark, OpenLane, demonstrate the efficacy and efficiency of DV-3DLane. It achieves state-of-the-art performance, with a remarkable 11.2 gain in F1 score and a substantial 53.5% reduction in errors.
Researcher Affiliation Academia 1 FNii, CUHK-Shenzhen; 2 School of Science and Engineering, CUHK-Shenzhen. {222010057@link.,shuguangcui@,lizhen@}cuhk.edu.cn
Pseudocode Yes Algorithm 1 Bidirectional Feature Fusion (BFF) Input: LiDAR points Ppt, image I, camera parameters T Output: mm-aware PV features Fpv, BEV features Fbev (mm denotes multi-modal).
Open Source Code Yes The code is available at https://github.com/JMoonr/dv-3dlane.
Open Datasets Yes We evaluate our method on OpenLane Chen et al. (2022), the sole public 3D lane dataset featuring multi-modal sources. OpenLane is a large-scale dataset built on the Waymo Open Dataset Sun et al. (2020).
Dataset Splits No The paper mentions training and testing on datasets like OpenLane-1K and OpenLane-300, but it does not specify the exact percentages or sample counts for validation splits, nor does it explicitly cite predefined validation splits within the paper.
Hardware Specification Yes All models are tested on a single V100 GPU
Software Dependencies No The paper mentions using 'Adam optimizer' and 'cosine annealing scheduler' but does not specify software versions for these or other dependencies like Python, PyTorch, or CUDA.
Experiment Setup Yes We set the number of lane queries to 40, and we employ deformable attention with 4 heads, 8 sample points, and 256 embedding dimensions. We use the Adam optimizer Kingma & Ba (2014) with a weight decay of 0.01. The learning rate is set to 2e-4, and our models undergo training for 24 epochs with a batch size of 32. We employ the cosine annealing scheduler Loshchilov & Hutter (2016) with Tmax = 8. Our input images are of resolution 720 × 960, and we adopt a voxel size of (0.2m, 0.4m) for the X and Y axes.
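The learning-rate schedule quoted above can be sketched directly from the cosine annealing formula of Loshchilov & Hutter (2016), using the paper's reported base learning rate of 2e-4 and Tmax = 8. This is a minimal stdlib-only illustration, not the authors' training code; the minimum learning rate `lr_min` is an assumed default of 0.

```python
import math

def cosine_annealing_lr(t, lr_max=2e-4, lr_min=0.0, t_max=8):
    """Cosine annealing (Loshchilov & Hutter, 2016):
    lr(t) = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / t_max)).
    lr_max = 2e-4 and t_max = 8 follow the paper; lr_min = 0 is an
    assumed default, not stated in the excerpt."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / t_max))

# Learning rate over one annealing period: starts at 2e-4 and decays to lr_min.
schedule = [cosine_annealing_lr(t) for t in range(9)]
```

In a framework like PyTorch, the equivalent configuration would pair an Adam optimizer (weight decay 0.01) with a cosine annealing scheduler stepped once per epoch over the 24-epoch run.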