Knowledge-Graph Augmented Word Representations for Named Entity Recognition

Authors: Qizhen He, Liang Wu, Yida Yin, Heming Cai (pp. 7919-7926)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that KAWR, as an augmented version of the existing linguistic word representations, promotes F1 scores on 5 datasets in various domains by +0.46~+2.07. Better generalization is also observed for KAWR on new entities that cannot be found in the training sets. In our experiments, we build KAWR based on the pre-trained BERT and a KG embedding model, and compare KAWR with the original BERT on 5 datasets covering various domains. (See the fusion sketch after this table.)
Researcher Affiliation | Industry | Bilibili, Shanghai, China. {heqizhen, wuliang, yinyida, caiheming}@bilibili.com
Pseudocode | No | The paper provides mathematical formulations and diagrams of its model architecture (e.g., Figure 1 for GERU) but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing its source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | Wikidata, hosted by the Wikimedia Foundation, is a free and open knowledge database that can be edited collaboratively and used by anyone under a public domain license. With growing data donations, such as the migration of Freebase by Google, Wikidata now covers tens of millions of entities with descriptions and statements in various domains, and is becoming an important source for wiki pages such as Wikipedia, Wikivoyage and Wikiquote. The specifications of the datasets are listed in Table 2: CoNLL2003, Genia, NCBI, SEC, WNUT16.
Dataset Splits | No | The paper lists 'Training size' and 'Testing size' in Table 2 for the datasets but does not explicitly provide information about a validation set or its split percentages/counts.
Hardware Specification | Yes | All experiments were carried out on a PowerEdge C4130 with a Tesla P40 GPU with 20 GB of memory.
Software Dependencies | No | The training process is implemented using TensorFlow. Facebook recently proposed a distributed system called PyTorch-BigGraph (PBG) for learning embeddings of extremely large graphs (Lerer et al., 2019). The paper mentions the software used (TensorFlow, PyTorch-BigGraph) but does not provide specific version numbers for these or other libraries/solvers.
Experiment Setup | Yes | The models are trained using the Adam Weight Decay Optimizer, which is based on stochastic gradient descent, and the hyperparameters are listed in Table 1: Batch size = 16; Learning rate = 2e-5; KG-embedding dimension = 200; Word-embedding dimension = 768; GERU hidden-state dimension = 200. (See the configuration sketch below.)
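
Fusion sketch referenced in the Research Type row: the paper builds KAWR by augmenting pre-trained BERT word representations with knowledge-graph entity embeddings. The snippet below is only a minimal NumPy illustration of that general augmentation idea, assuming the dimensions reported in Table 1 (768-d word vectors, 200-d KG embeddings, 200-d hidden state); it is not the paper's GERU gating mechanism, and every variable name in it is hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-ins for a contextual word vector from pre-trained BERT (768-d,
    # BERT-base hidden size) and a KG entity embedding (200-d, per Table 1).
    word_vec = rng.normal(size=768)
    kg_vec = rng.normal(size=200)

    # Naive fusion: concatenate the two vectors and project them to the
    # 200-d hidden size the paper reports for GERU. The paper injects the
    # KG signal through a gated recurrent unit instead; this is only a sketch.
    W = rng.normal(size=(200, 768 + 200)) * 0.01
    fused = np.tanh(W @ np.concatenate([word_vec, kg_vec]))

    print(fused.shape)  # (200,)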
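
Configuration sketch referenced in the Experiment Setup row: the reported hyperparameters captured as a small Python configuration, with a hedged stand-in optimizer. The paper trains with the BERT-style Adam Weight Decay Optimizer in TensorFlow; torch.optim.AdamW below is only an approximate substitute, and the weight-decay value and placeholder model are assumptions, not values from the paper.

    import torch

    # Hyperparameters reproduced from Table 1 of the paper.
    HPARAMS = {
        "batch_size": 16,
        "learning_rate": 2e-5,
        "kg_embedding_dim": 200,
        "word_embedding_dim": 768,
        "geru_hidden_dim": 200,
    }

    # Placeholder model standing in for the KAWR network (hypothetical).
    model = torch.nn.Linear(HPARAMS["word_embedding_dim"], HPARAMS["geru_hidden_dim"])

    # AdamW approximates the Adam Weight Decay Optimizer; weight_decay=0.01
    # is a common default, not a value reported in the paper.
    optimizer = torch.optim.AdamW(
        model.parameters(),
        lr=HPARAMS["learning_rate"],
        weight_decay=0.01,
    )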