Fine-Grained Entity Typing for Domain Independent Entity Linking

Authors: Yasumasa Onoe, Greg Durrett (pp. 8576-8583)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our entity linking system on the CoNLL-YAGO dataset (Hoffart et al. 2011) and show that our approach outperforms prior domain-independent entity linking systems. We also test our approach in a harder setting derived from the WikilinksNED dataset (Eshel et al. 2017) where all the mention-entity pairs are unseen during test time. Results indicate that our approach generalizes better than a state-of-the-art neural model on the dataset.
Researcher Affiliation | Academia | Yasumasa Onoe, Greg Durrett, Department of Computer Science, The University of Texas at Austin, {yasumasa, gdurrett}@cs.utexas.edu
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. Figure 2 presents a model architecture diagram, but not structured pseudocode.
Open Source Code | Yes | The code for the experiments is available at https://github.com/yasumasaonoe/ET4EL
Open Datasets | Yes | We evaluate our approach on the development/test sets of the CoNLL-YAGO dataset (Hoffart et al. 2011), which is a widely used entity linking benchmark. Additionally, we test our model in a much harder setting where the mention-entity pairs are unseen during test time. We create the training, development, and test sets from the WikilinksNED dataset (Eshel et al. 2017). For the CoNLL data, we use the publicly available candidate list, PPRforNED (Pershina, He, and Grishman 2015).
Dataset Splits | Yes | To ensure that all mentions in the development and test sets do not appear in the training set, we split the WikilinksNED training set into train, development, and test sets by unique mentions (15.5k for train, 1k for dev, and 1k for test). This results in 2.2M, 10k, and 10k examples, respectively.
Hardware Specification | No | The authors acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing the HPC resources used to conduct this research. Results presented in the paper were obtained using the Chameleon testbed supported by the National Science Foundation. While HPC resources and a testbed are mentioned, no specific hardware components (e.g., GPU model, CPU type, memory size) are detailed.
Software Dependencies | No | The paper mentions software components such as ELMo, BERT, and word2vecf but does not provide specific version numbers for any of them, which is required for reproducibility.
Experiment Setup | No | The paper states: "We follow Onoe and Durrett (2019) for our entity typing model design and hyperparameter choices." This refers to an external paper for hyperparameter details rather than providing them explicitly in the text.
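The mention-disjoint split described under Dataset Splits (partitioning unique mention strings first, then assigning all of each mention's examples to a single split) can be sketched as below. This is a minimal illustration, not the authors' released code: the `split_by_mention` helper and the `"mention"` key on each example are assumptions.

```python
import random
from collections import defaultdict

def split_by_mention(examples, n_dev_mentions, n_test_mentions, seed=0):
    """Split examples so that no mention surface form appears in more
    than one of train/dev/test. Each example is assumed to be a dict
    with a 'mention' key (hypothetical format)."""
    # Group every example under its mention string.
    by_mention = defaultdict(list)
    for ex in examples:
        by_mention[ex["mention"]].append(ex)

    # Shuffle the unique mentions deterministically, then carve off
    # the dev and test mention sets before assigning any examples.
    mentions = sorted(by_mention)
    random.Random(seed).shuffle(mentions)
    dev_m = set(mentions[:n_dev_mentions])
    test_m = set(mentions[n_dev_mentions:n_dev_mentions + n_test_mentions])

    train, dev, test = [], [], []
    for mention, exs in by_mention.items():
        if mention in dev_m:
            dev.extend(exs)
        elif mention in test_m:
            test.extend(exs)
        else:
            train.extend(exs)
    return train, dev, test
```

Because whole mention groups (not individual examples) are assigned to a split, the resulting example counts are uneven, matching the paper's 15.5k/1k/1k mention split yielding 2.2M/10k/10k examples.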