LoD-Loc: Aerial Visual Localization using LoD 3D Map with Neural Wireframe Alignment
Authors: Juelin Zhu, Shen Yan, Long Wang, Shengyue Zhang, Yu Liu, Maojun Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We conduct extensive experiments on these two datasets." Section 4 (Experiment): "Extensive experiments are conducted on the UAVD4L-LoD and Swiss-EPFL datasets to demonstrate the effectiveness of our proposed model as described in Sec. 4.2. Additionally, ablation studies are conducted on the UAVD4L-LoD dataset in Sec. 4.3." |
| Researcher Affiliation | Collaboration | 1 National University of Defense Technology; 2 SenseTime Research |
| Pseudocode | No | No explicitly labeled pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | The code and dataset are available at https://victorzoo.github.io/LoD-Loc.github.io/. |
| Open Datasets | Yes | As no public dataset exists for the studied problem, we collect two datasets with map levels of LoD3.0 and LoD2.0, along with real RGB queries and ground-truth pose annotations. We benchmark our method and demonstrate that LoD-Loc achieves excellent performance, even surpassing current state-of-the-art methods that use textured 3D models for localization. The code and dataset are available at https://victorzoo.github.io/LoD-Loc.github.io/. |
| Dataset Splits | No | For the UAVD4L-LoD dataset, we incorporate a subset of synthesized images from UAVD4L [72], which includes buildings, as training data. For Swiss-EPFL, we train the model by combining synthetic images LHS and real query images from the CrossLoc [74] project, following its data split pattern. The paper describes training data and then uses 'real query images' for inference/testing, but does not explicitly detail a separate validation split with percentages or counts. |
| Hardware Specification | Yes | The training and inference of the entire network are executed using 2 NVIDIA RTX 4090 GPUs. |
| Software Dependencies | No | The paper mentions software like 'Blender' and deep learning concepts, but does not provide specific version numbers for software dependencies such as libraries or frameworks. |
| Experiment Setup | Yes | During training, we set a random seed to limit 3D wireframe points {Pi} to 2,000 points, and the pose sampling number ml(x), ml(y), ml(z), ml(θ) for level l = 1, 2, 3 is uniformly set to [13, 7, 3] due to constraints related to CUDA memory. The image size is (512, 480) for the UAVD4L dataset and (720, 480) for the Swiss-EPFL dataset. The pose sampling range at level 1 is set as [10, 10, 30, 7.5], which refers to [rp(x), rp(y), rp(z), rp(θ)]. |
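
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object. The following is a minimal sketch, not the authors' code. The names `SamplingConfig`, `linspace`, `level1_offsets`, and `hypotheses_per_level` are hypothetical. It also assumes two things the excerpt does not state: that the pose hypotheses at each level form a full Cartesian grid over the four sampled dimensions (x, y, z, θ), and that each level-1 range value is a half-width centered on the prior pose.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SamplingConfig:
    # Values quoted in the Experiment Setup row.
    max_wireframe_points: int = 2000
    samples_per_level: tuple = (13, 7, 3)          # same count for x, y, z, theta
    level1_range: tuple = (10.0, 10.0, 30.0, 7.5)  # [r(x), r(y), r(z), r(theta)]
    image_size_uavd4l: tuple = (512, 480)
    image_size_swiss_epfl: tuple = (720, 480)


def linspace(half_range, n):
    """Return n evenly spaced offsets in [-half_range, +half_range].

    Treating the range as a symmetric half-width is an assumption.
    """
    if n == 1:
        return [0.0]
    return [-half_range + 2.0 * half_range * i / (n - 1) for i in range(n)]


def level1_offsets(cfg):
    """Cartesian grid of 4-DoF pose offsets at level 1 (assumed layout)."""
    n = cfg.samples_per_level[0]
    axes = [linspace(r, n) for r in cfg.level1_range]
    return list(product(*axes))


def hypotheses_per_level(cfg):
    """Number of pose hypotheses per level under the full-grid assumption."""
    return [n ** 4 for n in cfg.samples_per_level]


cfg = SamplingConfig()
counts = hypotheses_per_level(cfg)
grid = level1_offsets(cfg)
print(counts)
print(len(grid))
```

Under the full-grid assumption, level 1 alone evaluates 13^4 = 28,561 pose hypotheses, which is consistent with the paper's remark that the sampling numbers are bounded by CUDA memory.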