Laneformer: Object-Aware Row-Column Transformers for Lane Detection
Authors: Jianhua Han, Xiajun Deng, Xinyue Cai, Zhen Yang, Hang Xu, Chunjing Xu, Xiaodan Liang
AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate our Laneformer achieves state-of-the-art performances on CULane benchmark, in terms of 77.1% F1 score. We hope our simple and effective Laneformer will serve as a strong baseline for future research in self-attention models for lane detection. |
| Researcher Affiliation | Collaboration | Jianhua Han1, Xiajun Deng2, Xinyue Cai1, Zhen Yang1, Hang Xu1, Chunjing Xu1, Xiaodan Liang2* 1Huawei Noah's Ark Lab 2Shenzhen Campus of Sun Yat-sen University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a direct link to open-source code or explicitly state that the code for their method is released. |
| Open Datasets | Yes | We conduct experiments on the two most popular lane detection benchmarks. CULane (Pan et al. 2018) is a large-scale traffic lane detection dataset... TuSimple (TuSimple 2017) is an autonomous driving dataset... |
| Dataset Splits | Yes | CULane (Pan et al. 2018) is a large-scale traffic lane detection dataset... It consists of 88,880 training images, 9,675 validation images, and 34,680 test images. TuSimple (TuSimple 2017)... including 3,626 images for the training set and 2,782 images for the test set. |
| Hardware Specification | Yes | Eight V100s are used to train the model and the batch size is set to be 64. For latency comparison of different components in Laneformer, we conduct experiments on CULane testing split and illustrate the result in Table 3. After adding row-column attention and detection attention, there is only a 4.9%, 8.1% increment on inference FPS due to the efficient matrix multiplication. |
| Software Dependencies | No | The paper mentions using ResNet50 as backbone and Faster-RCNN, but does not provide specific version numbers for these or other software libraries/dependencies. |
| Experiment Setup | Yes | The input resolution is set to 820 × 295 for CULane and 640 × 360 for TuSimple. The bipartite matching and loss term coefficients ω1, ω2, ω3 and ω4 are set as 2, 10, 10, 10, respectively. The number of both encoder and decoder layers is set to 1. Moreover, we adopt 25 as the number of queries N and 10 as the number of used detected bounding boxes M. The learning rate is set to 1e-4 for the backbone and 1e-5 for the transformer. We train 100 epochs on CULane and drop the learning rate by ten at epoch 80. On TuSimple, the total number of training iterations is set to 28k and the learning rate drops at iteration 22k. |
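The hyperparameters quoted in the "Experiment Setup" and "Hardware Specification" rows can be collected into a single configuration sketch. The key names below are illustrative assumptions, not identifiers from an official Laneformer codebase (the report notes no code is released); only the values come from the paper's reported setup.

```python
# Hedged sketch: Laneformer training configuration as reported in the paper.
# All key names are hypothetical; values are taken from the quoted setup.
laneformer_config = {
    "backbone": "ResNet50",
    "input_resolution": {            # (width, height)
        "CULane": (820, 295),
        "TuSimple": (640, 360),
    },
    "loss_weights": {                # bipartite matching / loss coefficients
        "w1": 2, "w2": 10, "w3": 10, "w4": 10,
    },
    "num_encoder_layers": 1,
    "num_decoder_layers": 1,
    "num_queries": 25,               # N, lane queries
    "num_detected_boxes": 10,        # M, boxes fed to detection attention
    "lr_backbone": 1e-4,
    "lr_transformer": 1e-5,
    "batch_size": 64,                # trained on eight V100 GPUs
    "schedule": {
        "CULane": {"epochs": 100, "lr_drop_epoch": 80, "lr_drop_factor": 0.1},
        "TuSimple": {"iterations": 28_000, "lr_drop_iteration": 22_000},
    },
}
```

A dict like this is only a convenient summary; a reimplementation would still need the unreported details (optimizer, augmentations, exact matching cost) that the report flags as missing.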