LoD-Loc: Aerial Visual Localization using LoD 3D Map with Neural Wireframe Alignment

Authors: Juelin Zhu, Shen Yan, Long Wang, Shengyue Zhang, Yu Liu, Maojun Zhang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct extensive experiments on these two datasets." From Sec. 4 (Experiment): "Extensive experiments are conducted on the UAVD4L-LoD and Swiss-EPFL datasets to demonstrate the effectiveness of our proposed model as described in Sec. 4.2. Additionally, ablation studies are conducted on the UAVD4L-LoD dataset in Sec. 4.3."
Researcher Affiliation | Collaboration | 1 National University of Defense Technology; 2 SenseTime Research
Pseudocode | No | No explicitly labeled pseudocode or algorithm blocks were found.
Open Source Code | Yes | The code and dataset are available at https://victorzoo.github.io/LoD-Loc.github.io/.
Open Datasets | Yes | "As no public dataset exists for the studied problem, we collect two datasets with map levels of LoD3.0 and LoD2.0, along with real RGB queries and ground-truth pose annotations. We benchmark our method and demonstrate that LoD-Loc achieves excellent performance, even surpassing current state-of-the-art methods that use textured 3D models for localization. The code and dataset are available at https://victorzoo.github.io/LoD-Loc.github.io/."
Dataset Splits | No | "For the UAVD4L-LoD dataset, we incorporate a subset of synthesized images from UAVD4L [72], which includes buildings, as training data. For Swiss-EPFL, we train the model by combining synthetic images LHS and real query images from the CrossLoc [74] project, following its data split pattern." The paper describes the training data and uses real query images for inference and testing, but it does not detail a separate validation split with percentages or counts.
Hardware Specification | Yes | "The training and inference of the entire network are executed using 2 NVIDIA RTX 4090 GPUs."
Software Dependencies | No | The paper mentions software such as Blender and general deep learning concepts, but it gives no version numbers for software dependencies such as libraries or frameworks.
Experiment Setup | Yes | "During training, we set a random seed to limit 3D wireframe points {Pi} to 2,000 points, and the pose sampling number ml(x), ml(y), ml(z), ml(θ) for level l = 1, 2, 3 is uniformly set to [13, 7, 3] due to constraints related to CUDA memory. The image size is (512, 480) for the UAVD4L dataset and (720, 480) for the Swiss-EPFL dataset. The pose sampling range at level 1 is set as [10, 10, 30, 7.5] which refers to [rp(x), rp(y), rp(z), rp(θ)]."
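
To make the sampling numbers in the Experiment Setup row concrete, the following is a minimal sketch of how a 4-DoF pose hypothesis grid of that size could be enumerated. It is not the authors' implementation. The function names (`linspace`, `pose_grid`, `hypotheses_per_level`) are hypothetical, and the sketch assumes that each range value is a symmetric half-width around a prior pose and that samples are evenly spaced. The source does not state either point, nor how the range shrinks at levels 2 and 3, so only the level-1 range is used.

```python
from itertools import product

# Values quoted in the Experiment Setup row.
SAMPLES_PER_LEVEL = [13, 7, 3]          # samples per axis at levels 1, 2, 3
LEVEL1_RANGE = (10.0, 10.0, 30.0, 7.5)  # range for x, y, z, theta at level 1


def linspace(center, half_range, count):
    """Return `count` evenly spaced values spanning center +/- half_range.

    Treating the range as a symmetric half-width is an assumption.
    """
    if count == 1:
        return [center]
    step = 2.0 * half_range / (count - 1)
    return [center - half_range + i * step for i in range(count)]


def pose_grid(prior, ranges, count):
    """Cartesian product of per-axis samples around a prior (x, y, z, theta)."""
    axes = [linspace(c, r, count) for c, r in zip(prior, ranges)]
    return list(product(*axes))


def hypotheses_per_level(samples=SAMPLES_PER_LEVEL, dof=4):
    """Number of pose hypotheses when every axis uses the same sample count."""
    return [m ** dof for m in samples]


if __name__ == "__main__":
    print(hypotheses_per_level())
    grid = pose_grid((0.0, 0.0, 0.0, 0.0), LEVEL1_RANGE, SAMPLES_PER_LEVEL[0])
    print(len(grid), grid[0], grid[-1])
```

Under these assumptions the grid grows as the fourth power of the per-axis sample count (13^4 = 28,561 hypotheses at level 1 versus 3^4 = 81 at level 3), which is consistent with the quoted remark that the counts were chosen under CUDA memory constraints.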