Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Reconciling Geospatial Prediction and Retrieval via Sparse Representations

Authors: Yi Li, Yuanlong Chen, Weiming Huang, Xiaoli Li, Gao Cong

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on real-world datasets demonstrate 25.16% gains in prediction accuracy and 20.76% improvements in retrieval precision over state-of-the-art baselines, alongside 65.97% faster training. These advantages position UrbanSparse as a scalable solution for large urban datasets. In this section, we evaluate the output representations on geographic prediction and retrieval tasks following previous literature [1, 39]. We also perform efficiency and ablation studies.
Researcher Affiliation	Academia	Yi Li (College of Computing and Data Science, Nanyang Technological University); Yuanlong Chen (College of Computing and Data Science, Nanyang Technological University); Weiming Huang (School of Geography, University of Leeds); Xiaoli Li (Institute for Infocomm Research, A*STAR; College of Computing and Data Science, Nanyang Technological University); Gao Cong (College of Computing and Data Science, Nanyang Technological University)
Pseudocode	Yes	B Training Algorithm. We hereby detail the training algorithm used to balance the training on retrieval and prediction tasks in our framework. Algorithm 1: Two-Phase Training of UrbanSparse. 1: Input: D_pred (region data), D_retr (query-object pairs), E_warm = 3, E_total = 20; 2: Parameter: Model f_θ with codebook C; 3: procedure WARM-UP PHASE; 4: for epoch = 1 to E_warm do; 5: for each batch B in D_pred do; 6: Compute L_pred; 7: Update θ ← θ − η ∇_θ L_pred; 8: end for; 9: end for; 10: end procedure; 11: procedure ALTERNATING PHASE; 12: for epoch = E_warm + 1 to E_total do; 13: Shuffle D_pred and D_retr; 14: for i = 1 to max(|D_pred|, |D_retr|) do; 15: Sample batch B_p from D_pred, B_r from D_retr; 16: Compute L_pred on B_p via Eq. 3; 17: Compute L_retr on B_r via LambdaRank; 18: Update θ ← θ − η ∇_θ (L_pred + L_retr); 19: end for; 20: end for; 21: end procedure
Open Source Code	Yes	Data and code available at https://github.com/pkuliyi2015/UrbanSparse
Open Datasets	Yes	All datasets (or their corresponding embeddings/Bloom filters) used in this paper are publicly available. Table 9 lists each data type along with its source and download link. Table 9: Data sources and download links (data type: source, link). POI datasets and queries: Meituan, https://anonymous.4open.science/r/UrbanSparse; Population density: WorldPop, https://hub.worldpop.org; House prices: Beike, https://ke.com; Administrative boundaries: GADM, https://gadm.org
Dataset Splits	Yes	For retrieval tasks, we requested and got the established benchmark from [39], which has a fixed split with the train/dev/val ratio 0.81:0.09:0.10. As the splits are fixed without randomness, the standard deviations appear to be very small (< 0.003 for all methods) and we omit the standard deviations in our table. For prediction tasks, we follow common practice of unsupervised representation learning, evaluating the learned representations with scikit-learn Random Forest Regressor on all urban regions using 5-fold cross-validation. We strictly repeat all experiments 10 times, report the average results and standard deviations without cherry-picking.
Hardware Specification	Yes	We evaluate the training time of Urban Sparse against top-performing baselines on 1 NVIDIA V100 32GB. All experiments are conducted on 1 NVIDIA V100 32 GB.
Software Dependencies	No	The paper mentions "scikit-learn Random Forest Regressor" and "PyTorch" (in Appendix F.5) and "CUDA kernels" (in Section 5.4), but specific version numbers for these software components are not provided in the paper.
Experiment Setup	Yes	T(q, o) = Sigmoid(β_1 TextSim(q, o) + β_2) (6); D(q, o) = log(1 + Dist(q, o)) (7); Relevance(q, n) = T(q, n) + γ_1 D(q, n) + γ_2 T(q, n) D(q, n) (8). Here, we use the logarithm function to align the distance with human spatial perceptions, i.e., individuals are more sensitive to differences in proximity with nearby objects, while this sensitivity diminishes for objects further apart. The normalization of the text similarities facilitates its smooth combination with geographic distances. β_1, β_2, γ_1, γ_2 rescale and balance the influence of two similarities and their first-order interaction, which better excludes proximate objects with little text similarity. We initially set β_2 = γ_2 = 0 and β_1 = γ_1 = 1, and train these parameters together with the neural networks via LambdaRank [3] loss. To resolve this, we employ two-phase training: (1) Warm-up Phase: Train exclusively on prediction tasks for some (i.e., 2-3) epochs, and (2) Alternating Phase: Iteratively train on prediction and retrieval data batches (full algorithm in Appendix B). For the proposed UrbanSparse, we fix the Bloom filter length to m = 8192 with k = 2 SHA-256 hash functions. In prediction tasks, we set the output region representation dimension d = 64.
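
The relevance score quoted in the Experiment Setup row (Eq. 6-8) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `relevance` and its argument names are hypothetical, the text similarity and distance are taken as precomputed inputs because the quoted excerpt does not say how TextSim or Dist are computed, and the defaults are only the initial values stated in the excerpt (β_1 = γ_1 = 1, β_2 = γ_2 = 0), which the paper says are subsequently trained.

```python
import math


def relevance(text_sim, dist, beta1=1.0, beta2=0.0, gamma1=1.0, gamma2=0.0):
    """Illustrative relevance score following Eq. 6-8 as quoted above.

    text_sim: precomputed text similarity between query and object (assumed input).
    dist: non-negative geographic distance; the unit is not given in the excerpt.
    Defaults are the stated initial values, not trained parameters.
    """
    # Eq. 6: squash the rescaled text similarity into (0, 1).
    t = 1.0 / (1.0 + math.exp(-(beta1 * text_sim + beta2)))
    # Eq. 7: log(1 + Dist) compresses differences between far-apart objects.
    d = math.log1p(dist)
    # Eq. 8: text term, distance term, and their first-order interaction.
    return t + gamma1 * d + gamma2 * t * d


# The log transform makes the same 10-unit gap matter more near the query.
near_gap = math.log1p(10.0) - math.log1p(0.0)
far_gap = math.log1p(1010.0) - math.log1p(1000.0)
```

With the initial values the interaction term is zero, so the score reduces to the sigmoid of the text similarity plus the log-distance. How the learned signs of γ_1 and γ_2 trade off text match against proximity is determined by LambdaRank training and cannot be read off the excerpt.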