Patched Line Segment Learning for Vector Road Mapping
Authors: Jiakun Xu, Bowen Xu, Gui-Song Xia, Liang Dong, Nan Xue
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we demonstrate how an effective representation of a road graph significantly enhances the performance of vector road mapping on established benchmarks, without requiring extensive modifications to the neural network architecture. Furthermore, our method achieves state-of-the-art performance with just 6 GPU hours of training, leading to a substantial 32-fold reduction in training costs in terms of GPU hours. |
| Researcher Affiliation | Collaboration | Jiakun Xu¹, Bowen Xu¹, Gui-Song Xia¹, Liang Dong², Nan Xue*¹,³ (¹School of Computer Science, Wuhan University; ²Google Inc.; ³Ant Group) |
| Pseudocode | Yes | We developed a geometrically-meaningful scheme to reconstruct the road graphs from our PaLiS representation (see our supp. material for the pseudocode) by considering the properties of I-type, X-type and T-type foreground patches in the following three cases: (An illustrative reconstruction sketch is given below the table.) |
| Open Source Code | No | The paper does not provide a direct link to the source code for the methodology, nor does it explicitly state that the code is publicly released. |
| Open Datasets | Yes | We conduct experiments on two widely used datasets: City-scale dataset (He et al. 2020) and SpaceNet dataset (Van Etten, Lindenbaum, and Bacastow 2018). |
| Dataset Splits | Yes | City-scale dataset (He et al. 2020) covers a 720 km² area of 20 cities in the United States. It consists of 180 tiles, which we divide into 144, 9, and 27 tiles for training, validation, and testing respectively, following previous methods (He et al. 2020; He, Garg, and Chowdhury 2022; Xu et al. 2023b). ... We use 2040, 127, and 382 images for training, validation, and testing respectively, following the partition used in Sat2Graph (He et al. 2020). (A minimal split sketch is given below the table.) |
| Hardware Specification | No | The paper mentions "6 GPU hours of training" but does not specify the type or model of GPU used, or any other specific hardware components like CPU or memory. |
| Software Dependencies | No | The paper mentions using "D-LinkNet (Zhou, Zhang, and Wu 2018), with the lightweight ResNet-34 (He et al. 2016) as the backbone encoder" and implementation in "CUDA", but it does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or a specific CUDA version. |
| Experiment Setup | Yes | We use an encoder-decoder network, D-LinkNet (Zhou, Zhang, and Wu 2018), with the lightweight ResNet-34 (He et al. 2016) as the backbone encoder to extract feature maps for the learning of PaLiS. ... We use a patch classification head, which consists of four convolution layers, all with 3×3 kernels, and an MLP layer... we set a regression head with four 3×3 convolution layers and an MLP layer... We set the projection factor t = 10 if the pixel a is projected outside of the line segment; otherwise t is set to 1. ... We compute the loss by comparing the soft mask S_soft with the existing ground-truth mask S of road centerlines. Similar to BoundaryFormer (Lazarow, Xu, and Tu 2022), we employ the DICE (Milletari, Navab, and Ahmadi 2016) loss... The total loss of the PaLiS learning can be summarized as L_total = L_S + L_M + L_L. ... Considering accuracy and efficiency, we set the patch size to 8. (A sketch of the heads and loss combination is given below the table.) |
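
The Pseudocode row refers to the paper's supplementary material, which is not reproduced here. The following is a minimal, hypothetical Python sketch of how I-, T- and X-type foreground patches could be labeled and stitched into a road graph; the function names, the coincidence tolerance, and the labeling heuristic are illustrative assumptions rather than the paper's actual supplementary rules.

```python
from collections import defaultdict

def classify_patch(segments, tol=1.5):
    """Heuristically label a foreground patch from the line segments predicted
    inside it: I-type (a single road passes through), T-type (three branches
    meet at a shared endpoint), X-type (four branches meet, or two segments
    cross mid-segment)."""
    if len(segments) <= 1:
        return "I"
    endpoints = [p for seg in segments for p in seg]
    def coincident(p):
        return sum(1 for q in endpoints
                   if abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol)
    largest_cluster = max(coincident(p) for p in endpoints)
    if largest_cluster >= 4:
        return "X"
    if largest_cluster == 3:
        return "T"
    return "X"  # segments crossing without a shared endpoint

def reconstruct_graph(patch_segments):
    """patch_segments maps a patch index (row, col) to a list of segments
    ((x1, y1), (x2, y2)) in image coordinates; endpoints are quantized so
    that segments from neighbouring patches snap to shared graph nodes."""
    graph = defaultdict(set)
    for segments in patch_segments.values():
        for p, q in segments:
            a = (round(p[0]), round(p[1]))
            b = (round(q[0]), round(q[1]))
            if a != b:
                graph[a].add(b)
                graph[b].add(a)
    return graph

# Toy usage: two I-type patches whose segments meet on the shared border.
patches = {
    (0, 0): [((0.0, 4.0), (8.0, 4.2))],
    (0, 1): [((8.0, 4.1), (16.0, 4.0))],
}
print({idx: classify_patch(segs) for idx, segs in patches.items()})
print({node: sorted(nbrs) for node, nbrs in reconstruct_graph(patches).items()})
```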
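
For the Dataset Splits row, the quoted tile and image counts fully determine the partition sizes. Below is a minimal sketch assuming items indexed 0..N-1; the actual tile assignment follows the Sat2Graph partition (He et al. 2020) and should be taken from that release, not from this illustration.

```python
def split_indices(n_total, n_train, n_val, n_test):
    """Split indices 0..n_total-1 into contiguous train/val/test blocks."""
    assert n_train + n_val + n_test == n_total
    ids = list(range(n_total))
    return (ids[:n_train],
            ids[n_train:n_train + n_val],
            ids[n_train + n_val:])

# City-scale: 180 tiles -> 144 / 9 / 27; SpaceNet: 2040 / 127 / 382 images.
cityscale = split_indices(180, 144, 9, 27)
spacenet = split_indices(2040 + 127 + 382, 2040, 127, 382)
print([len(s) for s in cityscale], [len(s) for s in spacenet])
```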
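
The Experiment Setup row describes the architecture and losses only in prose. The PyTorch sketch below illustrates two heads of four 3×3 convolution layers plus an MLP on top of a stride-8 feature map (patch size 8), and a total loss summed with unit weights as L_total = L_S + L_M + L_L. The channel widths, output dimensions, and the concrete forms chosen here for L_M (binary cross-entropy) and L_L (L1) are assumptions, and the D-LinkNet/ResNet-34 backbone is stubbed out rather than reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_mlp_head(in_ch, out_dim, hidden=256):
    """Four 3x3 conv layers followed by a per-patch MLP (1x1 convolutions)."""
    layers, ch = [], in_ch
    for _ in range(4):
        layers += [nn.Conv2d(ch, hidden, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
        ch = hidden
    layers += [nn.Conv2d(hidden, hidden, kernel_size=1),
               nn.ReLU(inplace=True),
               nn.Conv2d(hidden, out_dim, kernel_size=1)]
    return nn.Sequential(*layers)

class PaLiSHeads(nn.Module):
    """Patch classification and line-segment regression heads on a feature
    map at patch resolution (patch size 8 -> stride-8 features)."""
    def __init__(self, feat_ch=64):
        super().__init__()
        self.cls_head = conv_mlp_head(feat_ch, 1)  # foreground-patch logit
        self.reg_head = conv_mlp_head(feat_ch, 4)  # endpoints (x1, y1, x2, y2)

    def forward(self, feats):                      # feats: B x C x H/8 x W/8
        return self.cls_head(feats), self.reg_head(feats)

def total_loss(cls_logits, cls_gt, seg_pred, seg_gt, soft_mask, gt_mask):
    """L_total = L_S + L_M + L_L with unit weights: a DICE loss L_S comparing
    the rendered soft mask with the ground-truth centerline mask, a patch
    classification loss L_M, and an endpoint regression loss L_L."""
    l_m = F.binary_cross_entropy_with_logits(cls_logits, cls_gt)
    l_l = F.l1_loss(seg_pred, seg_gt)
    inter = (soft_mask * gt_mask).sum()
    l_s = 1.0 - 2.0 * inter / (soft_mask.sum() + gt_mask.sum() + 1e-6)
    return l_s + l_m + l_l

# Toy usage with random tensors on a 64x64 patch grid.
heads = PaLiSHeads(feat_ch=64)
feats = torch.randn(1, 64, 64, 64)
cls_logits, seg_pred = heads(feats)
print(cls_logits.shape, seg_pred.shape)
```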