LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving
Authors: Tianyu Li, Peijin Jia, Bangjun Wang, Li Chen, Kun Jiang, Junchi Yan, Hongyang Li
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On the OpenLane-V2 dataset, LaneSegNet outperforms previous counterparts by a substantial gain across three tasks, i.e., map element detection (+4.8 mAP), centerline perception (+6.9 DET_l), and the newly defined one, lane segment perception (+5.6 mAP). |
| Researcher Affiliation | Collaboration | 1 Fudan University, 2 OpenDriveLab, 3 Tsinghua University, 4 Shanghai Jiao Tong University |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is accessible at https://github.com/OpenDriveLab/LaneSegNet. |
| Open Datasets | Yes | We evaluate LaneSegNet on the popular OpenLane-V2 dataset (Wang et al., 2023). |
| Dataset Splits | Yes | The training set includes about 27,000 frames, and the validation set includes about 4,800 frames. |
| Hardware Specification | Yes | LaneSegNet is trained using 8 NVIDIA Tesla V100 GPUs, with a total batch size of 8, over the course of 24 training epochs. |
| Software Dependencies | No | The paper mentions optimizers (AdamW) and frameworks (DETR-like paradigm) and architectures (ResNet-50, FPN, BEVFormer), but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions). |
| Experiment Setup | Yes | The initial learning rate is set to 2 × 10⁻⁴. LaneSegNet is trained using 8 NVIDIA Tesla V100 GPUs, with a total batch size of 8, over the course of 24 training epochs. The self-attention layer employs 8 attention heads. The cross-attention uses a lane attention module, which also has 8 attention heads and incorporates 32 offset points around the reference points. We adopt a two-layer FFN with a feed-forward channel size of 512. The initial query embedding, represented as q = [q_p, q_i], consists of 256 channels q_p for generating the initial reference point and 256 channels q_i for the initial instance query. The number of queries is set to 200 for lane segments. (See the configuration sketch below the table.) |
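
The "Experiment Setup" row above lists the reported hyperparameters in prose. The sketch below gathers them into a single, hypothetical mmdetection-style configuration for reference; the key names (e.g., `lane_attn`, `num_offset_points`, `ref_point_channels`) are assumptions for illustration, and only the numeric values are taken from the paper.

```python
# Hypothetical config sketch collecting the reported LaneSegNet hyperparameters.
# Key names are assumptions (mmdetection-style); numeric values come from the paper.

model = dict(
    type="LaneSegNet",
    backbone=dict(type="ResNet", depth=50),        # ResNet-50 image backbone
    neck=dict(type="FPN"),                         # FPN feature pyramid
    bev_encoder=dict(type="BEVFormerEncoder"),     # BEV feature construction
    decoder=dict(
        num_queries=200,                           # lane segment queries
        self_attn=dict(num_heads=8),               # 8-head self-attention
        lane_attn=dict(num_heads=8,                # lane attention module (cross-attention)
                       num_offset_points=32),      # 32 offset points around reference points
        ffn=dict(num_layers=2,
                 feedforward_channels=512),        # two-layer FFN, channel size 512
        query=dict(ref_point_channels=256,         # q_p: generates the initial reference point
                   instance_channels=256),         # q_i: initial instance query
    ),
)

optimizer = dict(type="AdamW", lr=2e-4)            # initial learning rate 2 × 10⁻⁴

training = dict(
    gpus=8,                                        # 8x NVIDIA Tesla V100
    total_batch_size=8,                            # i.e., 1 sample per GPU
    epochs=24,
)
```

Concatenating the two 256-channel parts reproduces the initial query q = [q_p, q_i] described in the table row.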