Zero-Shot Neural Transfer for Cross-Lingual Entity Linking

Authors: Shruti Rijhwani, Jiateng Xie, Graham Neubig, Jaime Carbonell

AAAI 2019, pp. 6924-6931

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "With experiments on 9 low-resource languages and transfer through a total of 54 languages, we show that our proposed pivot-based framework improves entity linking accuracy 17% (absolute) on average over the baseline systems, for the zero-shot scenario."
Researcher Affiliation | Academia | "Shruti Rijhwani, Jiateng Xie, Graham Neubig, Jaime Carbonell, Language Technologies Institute, Carnegie Mellon University, {srijhwan, jiatengx, gneubig, jgc}@cs.cmu.edu"
Pseudocode | No | The paper describes models and architectures (Figure 2, Figure 3) but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Code and data are available at https://github.com/neulab/pivotbased-entity-linking"
Open Datasets | Yes | "We primarily use Wikipedia links as a test bed because of the availability of data in many LRLs, not found in traditional EL datasets. [...] We also test our proposed PBEL method on the standard cross-lingual EL setting of linking textual mentions to KB entries. For the test set, we use annotated documents from the DARPA LORELEI program, in two extremely low-resource languages Tigrinya and Oromo." (Footnote: https://www.darpa.mil/program/low-resource-languages-for-emergent-incidents)
Dataset Splits | No | The paper mentions training data and test sets, but does not explicitly specify validation splits (e.g., percentages or counts) or a cross-validation setup required for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used to run its experiments.
Software Dependencies | No | "The encoder is implemented in DyNet (Neubig et al. 2017), with a character embedding size of 64 and LSTM hidden layer size of 512." While DyNet is mentioned, no version number is given for it, nor are other key software components listed with versions.
Experiment Setup | Yes | "The encoder is implemented in DyNet (Neubig et al. 2017), with a character embedding size of 64 and LSTM hidden layer size of 512." [...] "Since we want to efficiently train a model that can rank KB entries for a given mention, we follow existing work and use negative sampling with a max-margin loss for training the encoder (Collobert et al. 2011)."
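
To make the quoted setup concrete, below is a minimal sketch of a character-level BiLSTM mention/entity encoder with the stated dimensions (character embeddings of size 64, LSTM hidden size 512), trained with negative sampling and a max-margin (hinge) loss. The paper's implementation is in DyNet; this PyTorch version, along with the cosine-similarity scoring, the margin value, and all class and function names, is an illustrative assumption rather than the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharLSTMEncoder(nn.Module):
    """Character-level BiLSTM encoder for mention and KB-entry strings (sketch)."""
    def __init__(self, n_chars, emb_dim=64, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb_dim, padding_idx=0)
        # Each direction gets hidden_dim // 2 units so the concatenated
        # forward/backward final states have the stated size of 512.
        self.lstm = nn.LSTM(emb_dim, hidden_dim // 2,
                            batch_first=True, bidirectional=True)

    def forward(self, char_ids):                      # (batch, max_len) of char indices
        _, (h_n, _) = self.lstm(self.embed(char_ids))
        # Concatenate the final forward and backward hidden states.
        return torch.cat([h_n[0], h_n[1]], dim=-1)    # (batch, hidden_dim)

def max_margin_loss(encoder, mentions, gold_entries, negative_entries, margin=0.5):
    """Negative-sampling hinge loss: the gold KB entry should outscore each
    sampled negative entry by at least `margin` (margin value is assumed)."""
    m = F.normalize(encoder(mentions), dim=-1)
    pos = F.normalize(encoder(gold_entries), dim=-1)
    neg = F.normalize(encoder(negative_entries), dim=-1)
    pos_score = (m * pos).sum(-1)                     # cosine similarity with gold entry
    neg_score = (m * neg).sum(-1)                     # cosine similarity with sampled negative
    return F.relu(margin - pos_score + neg_score).mean()
```

At inference time, a mention would be linked by encoding it and every candidate KB entry with the same encoder and selecting the highest-scoring entry, matching the ranking objective described in the quote.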