Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

LocDiff: Identifying Locations on Earth by Diffusing in the Hilbert Space

Authors: Zhangyu Wang, Zeping Liu, Jielu Zhang, Zhongliang Zhou, Qian Cao, Nemin Wu, Lan Mu, Yang Song, Yiqun Xie, Ni Lao, Gengchen Mai

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that LocDiff can outperform all state-of-the-art grid-based, retrieval-based, and diffusion-based baselines across 5 challenging global-scale image geolocalization datasets, and demonstrates significantly stronger generalizability to unseen geolocations.
Researcher Affiliation | Collaboration | (1) University of Maine; (2) SEAI Lab, University of Texas at Austin; (3) University of Georgia; (4) University of Maryland; (5) Google LLC; (6) OpenAI; (7) Harvard University
Pseudocode | Yes | See Appendix A.5 for the pseudocode of LocDiff. Algorithm 1: Training LocDiff. Input: A dataset D with location-image pairs {(p, I)}. Each location is a tuple of latitude and longitude p = (θ, ϕ). A Gaussian noise scheduler N(t), where t is the time step. SHDD encoder PE_SHDD. Pretrained image encoder E_Im. CS-UNet model M with random initialization. SHDD KL-divergence loss L_SHDD-KL. Output: A trained CS-UNet model M. Step 1: For (p, I) ∈ D: compute the SHDD encoding of p: e ← PE_SHDD(θ, ϕ); compute the image embedding of I: e_I ← E_Im(I); randomly draw a time step t; add Gaussian noise to the SHDD encoding (forward process): e′ ← e + N(t); use the CS-UNet to denoise the SHDD encoding conditioned on the image embedding (backward process): ê ← M(e′, e_I, t); compute the SHDD KL-divergence loss: l ← L_SHDD-KL(ê, e); use gradient descent to minimize l and update M: M ← arg min_M l. Algorithm 2: Inferencing LocDiff. Input: Image I. Random image augmentation AUG. Ensemble number N. DDPM sampler DDPM. DDPM step T. Pretrained image encoder E_Im. Trained CS-UNet model M with random initialization. SHDD mode-seeking decoder PD_mode. Mode-seeking range hyperparameter ρ. An initial location p = (0, 0). Output: An ensembled location prediction p̂.
Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We include the code and access to open-source data in the supplementary materials.
Open Datasets | Yes | The training dataset is MP16 (MediaEval Placing Tasks 2016 [19]) containing 4.72 million geotagged images. The test datasets are 3 global-scale image geolocalization datasets Im2GPS3k [11], YFCC26k [44], and GWS15k [6]. ... we follow the evaluation setup in [8] and compare our LocDiff against these generative geolocalization models on two datasets OSV-5M [3] and YFCC-4k [46].
Dataset Splits | Yes | The training dataset is MP16 (MediaEval Placing Tasks 2016 [19]) containing 4.72 million geotagged images. The test datasets are 3 global-scale image geolocalization datasets Im2GPS3k [11], YFCC26k [44], and GWS15k [6]. Note that the test datasets Im2GPS3k and YFCC-26k have similar distributions to MP16, and more importantly, their data points might overlap with those in MP16, which benefits retrieval-based approaches [45]. ... we follow the evaluation setup in [8] and compare our LocDiff against these generative geolocalization models on two datasets OSV-5M [3] and YFCC-4k [46].
Hardware Specification | Yes | We trained our model on a Linux server equipped with four NVIDIA RTX 5500 GPUs, each with 24GB of memory.
Software Dependencies | No | We implement the DDPM algorithm based on the open-source PyTorch implementation. We use an Adam optimizer. Table 11 lists the details of our training setup.
Experiment Setup | Yes | Table 11: Training Set-up. Degree: L = 23, 47. Dimensions: d = 576, 2304; d_I = 768; d_T = 200. Hyperparameters: batch size = 512; lr = 0.0001; epochs = 500; beta = [0.9, 0.99]; weight decay = 0.0005; dropout = 0.3; anchor size = 2048. Table 11 lists the details of our training setup. We use an Adam optimizer.
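
For readers who want to sanity-check the quoted setup, the following is a minimal Python sketch, not the authors' code. It transcribes the Table 11 values into a hypothetical TrainingSetup record and mirrors the order of steps in Algorithm 1 with a hypothetical training_step function whose components (encode_location, encode_image, add_noise, denoise, loss_fn) are placeholders supplied by the caller. All names, and the num_steps parameter, are assumptions made for illustration. The listed dimensions d = 576 and 2304 equal (L + 1)^2 for L = 23 and 47, which is the number of spherical harmonic basis functions up to degree L; whether the paper derives d this way is an assumption here.

```python
import random
from dataclasses import dataclass
from typing import Tuple


def sh_coefficient_count(degree: int) -> int:
    """Number of spherical-harmonic basis functions for degrees 0..degree."""
    return (degree + 1) ** 2


@dataclass(frozen=True)
class TrainingSetup:
    """Values transcribed from Table 11; field names are illustrative."""
    degree: int = 23                      # Degree L (Table 11 also lists 47)
    image_dim: int = 768                  # d_I
    time_dim: int = 200                   # d_T
    batch_size: int = 512
    lr: float = 1e-4
    epochs: int = 500
    betas: Tuple[float, float] = (0.9, 0.99)
    weight_decay: float = 5e-4
    dropout: float = 0.3
    anchor_size: int = 2048

    @property
    def shdd_dim(self) -> int:
        # Assumption: d follows from L as (L + 1)^2.
        return sh_coefficient_count(self.degree)


def training_step(location, image, *, encode_location, encode_image,
                  add_noise, denoise, loss_fn, num_steps, rng=random):
    """One pass over a (location, image) pair in the order of Algorithm 1.

    Every callable is a caller-supplied placeholder; the gradient update
    (the final step of Algorithm 1) is left to the caller.
    """
    theta, phi = location
    e = encode_location(theta, phi)       # e <- PE_SHDD(theta, phi)
    e_img = encode_image(image)           # e_I <- E_Im(I)
    t = rng.randrange(num_steps)          # randomly draw a time step t
    e_noisy = add_noise(e, t)             # forward process: e' <- e + N(t)
    e_hat = denoise(e_noisy, e_img, t)    # backward process: e_hat <- M(e', e_I, t)
    return loss_fn(e_hat, e)              # l <- L_SHDD-KL(e_hat, e)
```

With real components, the returned loss would be passed to an optimizer such as Adam configured from the lr, betas, and weight_decay fields above.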