Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
Authors: Jiahao Wen, Hang Yu, Zhedong Zheng
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments validate that, under diverse weather conditions, our method achieves competitive recall rates compared to state-of-the-art drone geo-localization methods. Notably, it improves Recall@1 by 13.37% under night conditions and by 18.69% under fog and snow conditions. Our code is available at https://github.com/Jahawn-Wen/WeatherPrompt. ... 4 Experiment Implementation Details. We adopt XVLM [16] as the backbone... The experimental results on the University-1652 dataset are shown in Tab. 1. ... 4.2 Ablation Studies and Further Discussion |
| Researcher Affiliation | Academia | Jiahao Wen1 Hang Yu1 Zhedong Zheng2 1School of Computer Engineering and Science, Shanghai University, China 2Faculty of Science and Technology and Institute of Collaborative Innovation, University of Macau, China EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods using text, equations, and diagrams (e.g., Figure 2, Figure 3) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/Jahawn-Wen/WeatherPrompt. |
| Open Datasets | Yes | Dataset. University-1652 [1] is a large-scale cross-view geo-localization dataset comprising images from 1,652 university locations. ... SUES-200 [67] contains multi-view drone and satellite images from 200 locations in Shanghai, encompassing diverse urban scenes, parks, lakes, and public buildings. |
| Dataset Splits | Yes | University-1652 [1] is a large-scale cross-view geo-localization dataset comprising images from 1,652 university locations. ... The dataset is split into 701 training and 951 test buildings, with no overlap between train and test sets. |
| Hardware Specification | Yes | All experiments have been implemented in PyTorch [66] and conducted on a single NVIDIA RTX A6000 GPU, with an average inference time of 0.024s per query. |
| Software Dependencies | No | We adopt XVLM [16] as the backbone, which is pre-trained on 4M image caption pairs, integrates BERT [63] as the text encoder and Swin Transformer [64] as the image encoder. ... All experiments have been implemented in PyTorch [66] and conducted on a single NVIDIA RTX A6000 GPU... We utilize the imgaug [61] library to synthesize realistic weather variations. |
| Experiment Setup | Yes | Implementation Details. We adopt XVLM [16] as the backbone, which is pre-trained on 4M image caption pairs, integrates BERT [63] as the text encoder and Swin Transformer [64] as the image encoder. The model is optimized using stochastic gradient descent (SGD) [65] with momentum 0.9 and weight decay 0.0005. Training 210 epochs, with the learning rate reduced by 0.1 at epoch 120 and by 0.01 at epoch 180. We resize input images to 384 × 384 pixels and divide them into 32 × 32 patches. During training, satellite-view images are augmented via random cropping and horizontal flipping; for drone-view images, we first apply style transformations using the imgaug [61] library and then perform the same random crop and flip augmentations. At test time, we compute the Euclidean distance between query and candidate embeddings to measure similarity. |
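
The Experiment Setup excerpt describes a step learning-rate schedule and Euclidean-distance retrieval at test time. The sketch below illustrates both in plain Python and is not the authors' code. The function names (`lr_factor`, `rank_gallery`, `recall_at_k`) are hypothetical. It assumes the phrase "reduced by 0.1 at epoch 120 and by 0.01 at epoch 180" means multiplicative factors relative to the base learning rate, and that epochs are counted from 0. The excerpt does not state the base learning rate, so none is given here.

```python
import math
from typing import List, Sequence


def lr_factor(epoch: int) -> float:
    """Multiplier applied to the base learning rate at a given epoch.

    Assumption: factors are relative to the base rate (x0.1 from epoch 120,
    x0.01 from epoch 180), with epochs counted from 0.
    """
    if epoch >= 180:
        return 0.01
    if epoch >= 120:
        return 0.1
    return 1.0


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(query: Sequence[float],
                 gallery: Sequence[Sequence[float]]) -> List[int]:
    """Gallery indices sorted from nearest to farthest from the query."""
    return sorted(range(len(gallery)),
                  key=lambda i: euclidean(query, gallery[i]))


def recall_at_k(queries: Sequence[Sequence[float]],
                query_labels: Sequence[str],
                gallery: Sequence[Sequence[float]],
                gallery_labels: Sequence[str],
                k: int = 1) -> float:
    """Fraction of queries with a same-label gallery item in the top k."""
    hits = 0
    for query, label in zip(queries, query_labels):
        top_k = rank_gallery(query, gallery)[:k]
        if any(gallery_labels[i] == label for i in top_k):
            hits += 1
    return hits / len(queries)


# Toy 2-D embeddings standing in for drone (query) and satellite (gallery) views.
toy_gallery = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
toy_gallery_labels = ["a", "b", "c"]
toy_queries = [[0.1, 0.0], [4.0, 4.0]]
toy_query_labels = ["a", "b"]

r1 = recall_at_k(toy_queries, toy_query_labels, toy_gallery, toy_gallery_labels, k=1)
r2 = recall_at_k(toy_queries, toy_query_labels, toy_gallery, toy_gallery_labels, k=2)
print(f"Recall@1 = {r1:.2f}, Recall@2 = {r2:.2f}")
```

In a PyTorch training loop the same schedule would normally be handled by a built-in scheduler and the distances computed in batch on the GPU. The pure-Python version is used here only to keep the example free of dependencies.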