Gazetteer-Independent Toponym Resolution Using Geographic Word Profiles

Authors: Grant DeLozier, Jason Baldridge, Loretta London

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Table 2 shows test set performance for all models when resolving gold-standard toponyms.
Researcher Affiliation | Academia | University of Texas at Austin, Austin TX, 78712 {grantdelozier, jbaldrid}@utexas.edu loretta.r.london@gmail.com
Pseudocode | No | No pseudocode or algorithm blocks are present in the paper.
Open Source Code | Yes | TopoCluster code and precomputed local statistic calculations are available online https://github.com/grantdelozier/TopoCluster
Open Datasets | Yes | For this, we use GeoWiki, the subset of Wikipedia pages that contain latitude-longitude pairs in their info box. ... We use two corpora used previously by (Speriosu and Baldridge 2013): TR-CoNLL (Leidner 2008) and CWar (Speriosu 2013). ... The Local-Global Lexicon corpus (LGL) was developed by (Lieberman, Samet, and Sankaranarayanan 2010)
Dataset Splits | Yes | TR-CoNLL was split by Speriosu and Baldridge (2013) into a dev (4,356 Toponyms) and a held-out test set (1,903 Toponyms). ... We use the same split of CWar as (Speriosu and Baldridge 2013): dev (157,000 toponyms) and test (85,000 toponyms).
Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, etc.) used for running experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions 'Stanford NER's 3-class CRF model' but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | A grid search was run on the dev portions of the datasets to derive values of three parameters θ1, θ2, and θ3 corresponding to weights on the g of the main toponym, context toponyms, and other context words, respectively. ... Table 1 shows the values obtained for the respective Model-Domain combinations.
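
The Experiment Setup entry describes a grid search over three weights (θ1, θ2, θ3) tuned on the dev portions of the datasets. The sketch below illustrates that general procedure only. The quoted text does not give the candidate grid or the objective, so the candidate values, the `evaluate` callable, and the toy error function are all hypothetical stand-ins, not the authors' implementation.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Return the (theta1, theta2, theta3) triple with the lowest dev-set error.

    `evaluate` is a hypothetical callable mapping a weight triple to a
    dev-set error (lower is better). `grid` is the list of candidate values
    tried for each of the three weights. Both are assumptions made for
    illustration.
    """
    best_weights, best_error = None, float("inf")
    # Exhaustively try every combination of the three weights.
    for weights in product(grid, repeat=3):
        error = evaluate(weights)
        if error < best_error:
            best_weights, best_error = weights, error
    return best_weights, best_error


# Toy stand-in for a dev-set evaluation (assumption): the error is
# smallest at the arbitrary point (40, 1, 0.5).
def toy_dev_error(weights):
    t1, t2, t3 = weights
    return abs(t1 - 40) + abs(t2 - 1) + abs(t3 - 0.5)


best, err = grid_search(toy_dev_error, grid=[0.5, 1, 5, 10, 20, 40])
```

In a real run, `evaluate` would resolve the dev-set toponyms with the given weights and return an error measure. The exhaustive loop costs one evaluation per weight combination, so a grid of six candidate values needs 216 evaluations.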