Knowledge-aware Named Entity Recognition with Alleviating Heterogeneity

Authors: Binling Nie, Ruixue Ding, Pengjun Xie, Fei Huang, Chen Qian, Luo Si (pp. 13595-13603)

Venue: AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results demonstrate KaNa's state-of-the-art performance on five public benchmark datasets from different domains. The main experiments are conducted on five datasets from different domains, including spoken queries (MIT-Movie, MIT-Restaurant), defense and security (RE3D), anatomical (AnEM), and biomedical (BC5CDR-Disease). Table 1: Experiment results of all methods on the five datasets.
Researcher Affiliation | Collaboration | Binling Nie (1*), Ruixue Ding (2), Pengjun Xie (2), Fei Huang (2), Chen Qian (3), Luo Si (2); (1) Hangzhou Dianzi University, (2) Alibaba Group, (3) Tsinghua University; binlingnie@hdu.edu.cn, {ada.drx, f.huang, luo.si}@alibaba-inc.com, chengchen.xpj@taobao.com, qc16@mails.tsinghua.edu.cn
Pseudocode | No | No explicit pseudocode or algorithm blocks were found.
Open Source Code | No | No explicit statement about releasing source code, or a link to a code repository for the described methodology, was found.
Open Datasets | Yes | MIT-Movie (Liu et al. 2013b) contains 10,343 movie queries in which long constituents, such as a movie's origin and plot descriptions, are annotated. MIT-Restaurant (Liu et al. 2013a) contains 7,136 restaurant queries. RE3D (DSTL 2017) consists of 1,009 sentences and is relevant to the defence and security analysis domain. AnEM (Ohta et al. 2012) is a dataset of 500 documents manually annotated for anatomical entity mentions using a fine-grained classification system. BC5CDR-Disease (Li et al. 2016) is a collection of 1,500 PubMed titles and abstracts.
Dataset Splits | No | The paper mentions using 'training data' in the context of the noise detector and evaluating on a 'test file', but it does not explicitly provide specific train/validation/test splits (percentages or counts) or cite standard splits.
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) used for running the experiments were mentioned.
Software Dependencies | No | The paper mentions software such as NCRF++, BERT, Wikidata, KnowBERT, and PyTorch-BigGraph, but does not provide specific version numbers for these or other ancillary components.
Experiment Setup | Yes | For the hyper-parameter settings, the hidden state size of the BiLSTM is 200. The dropout rate is set to 0.2 for the BiLSTM output and 0.5 for the pretrained embeddings. A two-layer GAT is applied for knowledge injection: the first layer consists of K = 5 attention heads with node dimension F = 30, and the last layer is a single attention head with F = C, where C is the number of types in NER. The dropout rate is set to 0.1 for the GAT layers and 0.4 for the pretrained entity embeddings. Training uses the SGD optimizer with a mini-batch size of 10, learning rate γ = 1e-3, and L2 regularization λ = 0.005.
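
As a rough illustration of the reported setup, the following minimal PyTorch sketch wires the stated hyper-parameters together. It is not the authors' implementation: PyTorch Geometric's GATConv stands in for the paper's GAT layers, the fusion of the BiLSTM and GAT outputs is omitted, and C, WORD_DIM, and ENT_DIM are placeholder values; the paper also leaves ambiguous whether the BiLSTM hidden size of 200 is per direction or for the concatenated output.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GATConv  # assumption: PyTorch Geometric supplies the GAT layers

C = 9           # placeholder: number of NER types (dataset-dependent)
WORD_DIM = 100  # placeholder: pretrained word-embedding size
ENT_DIM = 100   # placeholder: pretrained entity-embedding size

class KaNaSketch(nn.Module):
    # Configuration-level sketch: BiLSTM encoder (hidden size 200) plus a
    # two-layer GAT (K = 5 heads with F = 30 in the first layer; a single
    # head with F = C in the last), using the dropout rates reported above.
    def __init__(self):
        super().__init__()
        self.word_dropout = nn.Dropout(0.5)   # on pretrained word embeddings
        # 200 is read here as 100 per direction; the paper does not say
        # whether the size is per direction or for the concatenated output
        self.bilstm = nn.LSTM(WORD_DIM, 100, bidirectional=True, batch_first=True)
        self.lstm_dropout = nn.Dropout(0.2)   # on BiLSTM output
        self.ent_dropout = nn.Dropout(0.4)    # on pretrained entity embeddings
        self.gat1 = GATConv(ENT_DIM, 30, heads=5, dropout=0.1)   # K = 5, F = 30
        self.gat2 = GATConv(30 * 5, C, heads=1, dropout=0.1)     # single head, F = C

    def forward(self, words, ent_x, edge_index):
        h, _ = self.bilstm(self.word_dropout(words))
        h = self.lstm_dropout(h)
        g = F.elu(self.gat1(self.ent_dropout(ent_x), edge_index))
        g = self.gat2(g, edge_index)
        return h, g  # how h and g are fused is model-specific and omitted here

model = KaNaSketch()
# SGD with lr = 1e-3; weight_decay is assumed to realize the L2 term lambda = 0.005
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=0.005)
BATCH_SIZE = 10  # mini-batch size reported in the paper

Using weight_decay to realize the L2 term is the standard PyTorch idiom; if the paper applied λ only to a subset of parameters, the parameter groups would need to be split accordingly.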