Zero-Shot Neural Transfer for Cross-Lingual Entity Linking
Authors: Shruti Rijhwani, Jiateng Xie, Graham Neubig, Jaime Carbonell
AAAI 2019, pp. 6924-6931
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With experiments on 9 low-resource languages and transfer through a total of 54 languages, we show that our proposed pivot-based framework improves entity linking accuracy 17% (absolute) on average over the baseline systems, for the zero-shot scenario. |
| Researcher Affiliation | Academia | Shruti Rijhwani, Jiateng Xie, Graham Neubig, Jaime Carbonell Language Technologies Institute Carnegie Mellon University {srijhwan, jiatengx, gneubig, jgc}@cs.cmu.edu |
| Pseudocode | No | The paper describes models and architectures (Figure 2, Figure 3) but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and data are available at https://github.com/neulab/pivotbased-entity-linking |
| Open Datasets | Yes | We primarily use Wikipedia links as a test bed because of the availability of data in many LRLs, not found in traditional EL datasets. [...] We also test our proposed PBEL method on the standard cross-lingual EL setting of linking textual mentions to KB entries. For the test set, we use annotated documents from the DARPA LORELEI program, in two extremely low-resource languages, Tigrinya and Oromo. [...] https://www.darpa.mil/program/low-resource-languages-for-emergent-incidents |
| Dataset Splits | No | The paper mentions training data and test sets, but does not explicitly specify validation dataset splits (e.g., percentages or counts) or cross-validation setup required for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (like exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The encoder is implemented in DyNet (Neubig et al. 2017), with a character embedding size of 64 and LSTM hidden layer size of 512. - While DyNet is named, no specific version number is provided, nor are other key software components listed with versions. |
| Experiment Setup | Yes | The encoder is implemented in DyNet (Neubig et al. 2017), with a character embedding size of 64 and LSTM hidden layer size of 512. - Since we want to efficiently train a model that can rank KB entries for a given mention, we follow existing work and use negative sampling with a max-margin loss for training the encoder (Collobert et al. 2011). (See the illustrative sketch below this table.) |
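
The experiment-setup row describes a character-level LSTM encoder (character embedding size 64, LSTM hidden size 512) trained with negative sampling and a max-margin loss. The minimal sketch below only illustrates that reported configuration; it is not the authors' DyNet implementation (their released code is linked above). The PyTorch framing, dot-product scoring, margin value, and all variable names are assumptions made for illustration.

```python
# Illustrative sketch only, not the paper's code: a character-level LSTM
# encoder with the reported sizes (char embedding 64, LSTM hidden 512) and a
# max-margin loss with negative sampling. The scoring function (dot product)
# and margin (1.0) are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharLSTMEncoder(nn.Module):
    def __init__(self, num_chars, char_emb_dim=64, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(num_chars, char_emb_dim)
        self.lstm = nn.LSTM(char_emb_dim, hidden_dim, batch_first=True)

    def forward(self, char_ids):
        # char_ids: (batch, max_len) integer character indices
        embedded = self.embed(char_ids)          # (batch, max_len, 64)
        _, (hidden, _) = self.lstm(embedded)     # hidden: (1, batch, 512)
        return hidden.squeeze(0)                 # (batch, 512)

def max_margin_loss(mention_vec, pos_entity_vec, neg_entity_vecs, margin=1.0):
    """Hinge loss: the gold KB entry should outscore each sampled negative
    by at least `margin`. Scores here are dot products (an assumption)."""
    pos_score = (mention_vec * pos_entity_vec).sum(dim=-1)             # (batch,)
    neg_scores = torch.einsum("bd,bnd->bn", mention_vec, neg_entity_vecs)
    return F.relu(margin - pos_score.unsqueeze(1) + neg_scores).mean()

# Usage sketch: encode mention strings and candidate KB entry names as
# character-index tensors, then train with sampled negatives.
encoder = CharLSTMEncoder(num_chars=256)
mention = torch.randint(0, 256, (8, 20))        # batch of 8 mention strings
positive = torch.randint(0, 256, (8, 20))       # gold KB entry names
negatives = torch.randint(0, 256, (8, 5, 20))   # 5 sampled negatives each

m = encoder(mention)
p = encoder(positive)
n = encoder(negatives.view(-1, 20)).view(8, 5, -1)
loss = max_margin_loss(m, p, n)
loss.backward()
```

In the actual system the negatives would be drawn from KB entry names rather than random character sequences; the random tensors above only stand in for batches of character indices.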