Nested Named Entity Recognition with Partially-Observed TreeCRFs

Authors: Yao Fu, Chuanqi Tan, Mosha Chen, Songfang Huang, Fei Huang (pp. 12839-12847)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our approach achieves the state-of-the-art (SOTA) F1 scores on the ACE2004, ACE2005 dataset, and shows comparable performance to SOTA models on the GENIA dataset. We conduct experiments on three standard benchmark datasets.
Researcher Affiliation | Collaboration | Yao Fu1, Chuanqi Tan2, Mosha Chen2, Songfang Huang2, Fei Huang2; 1University of Edinburgh, 2Alibaba Group
Pseudocode | Yes | Algorithm 1 SYMBOL TREE AND MASK CONSTRUCTION; Algorithm 2 INSIDE FOR PARTIAL MARGINALIZATION; Algorithm 3 MASKED INSIDE (see the masked-inside sketch after this table)
Open Source Code | Yes | We release the code at https://github.com/FranxYao/Partially-Observed-TreeCRFs.
Open Datasets | Yes | We conduct experiments on the ACE2004, ACE2005 (Doddington et al. 2004), and GENIA (Kim et al. 2003) datasets.
Dataset Splits | Yes | The statistics of these datasets are shown in Table 1.
Hardware Specification | Yes | GPU: Nvidia P100; CPU: Intel 2.6GHz quad-core i7
Software Dependencies | No | While specific BERT models (bert-large-cased, BioBERT v1.1) are mentioned, the paper does not provide specific version numbers for key software libraries like PyTorch or Torch-Struct.
Experiment Setup | Yes | We use the AdamW optimizer with the learning rate 2e-5 on the ACE2004 dataset and 3e-5 on the ACE2005 and GENIA datasets. The ϵ used for structure smoothing is 0.01 on the ACE2004 dataset and 0.02 on the ACE2005 and GENIA datasets. We apply 0.2 dropout after BERT encoding. Denote the hidden size of the encoder as h (h = 1024 for BERT Large, and 768 for BioBERT). We apply two feed-forward layers before the biaffine scoring mechanism, with h and h/2 hidden size, respectively. (See the model/optimizer sketch after this table.)
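The Pseudocode row lists Algorithm 3 (MASKED INSIDE). The snippet below is a minimal sketch of how such a masked inside recursion can be written: span-factored log-potentials over binary trees, with a 0/1 mask that zeroes out (in probability space) any span that conflicts with the observed entity spans. The function name `masked_inside`, the tensor shapes, and the `NEG_INF` constant are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a masked inside recursion for a span-based TreeCRF,
# in the spirit of Algorithm 3 (MASKED INSIDE). Illustrative only.
import torch

NEG_INF = -1e9  # stand-in for -infinity in log space


def masked_inside(span_scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Log partition over binary trees whose spans respect `mask`.

    span_scores: [n, n] log-potentials; span_scores[i, j] scores span (i..j).
    mask:        [n, n] with 1 for allowed spans and 0 for forbidden ones
                 (e.g. spans that cross an observed entity span).
    Returns the log partition function of the constrained tree distribution.
    """
    n = span_scores.size(0)
    # Forbidden spans contribute exp(-inf) = 0 to the sum over trees.
    scores = span_scores + (1.0 - mask) * NEG_INF
    beta = torch.full((n, n), NEG_INF)
    idx = torch.arange(n)
    beta[idx, idx] = scores[idx, idx]                  # length-1 spans
    for width in range(1, n):                          # span length - 1
        for i in range(n - width):
            j = i + width
            # Combine all split points k: (i..k) + (k+1..j).
            left = beta[i, i:j]                        # beta[i, k], k in [i, j-1]
            right = beta[i + 1:j + 1, j]               # beta[k+1, j]
            beta[i, j] = scores[i, j] + torch.logsumexp(left + right, dim=0)
    return beta[0, n - 1]
```

With `mask = torch.ones(n, n)` this reduces to the ordinary inside algorithm; zeroing entries of `mask` removes every tree containing the corresponding spans from the partition function, which is what the partial marginalization over compatible trees requires.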
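The Experiment Setup row describes the scoring head and optimizer. Below is a minimal PyTorch sketch under those settings: 0.2 dropout after the BERT encoding, two feed-forward layers of size h and h/2 before a biaffine scorer, and AdamW with learning rate 2e-5. The class name, label count, and the exact biaffine parameterization are assumptions for illustration, not the released code.

```python
# Minimal sketch of a biaffine span-scoring head, assuming the settings in
# the Experiment Setup row. Names and label count are illustrative.
import torch
import torch.nn as nn


class BiaffineSpanScorer(nn.Module):
    def __init__(self, hidden_size: int = 1024, num_labels: int = 8):
        super().__init__()
        self.dropout = nn.Dropout(0.2)  # 0.2 dropout after BERT encoding
        # Two feed-forward layers with h and h/2 units before biaffine scoring.
        self.start_mlp = nn.Sequential(
            nn.Linear(hidden_size, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, hidden_size // 2), nn.ReLU(),
        )
        self.end_mlp = nn.Sequential(
            nn.Linear(hidden_size, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, hidden_size // 2), nn.ReLU(),
        )
        # One (d+1) x (d+1) biaffine matrix per label (bias via an appended 1).
        d = hidden_size // 2
        self.biaffine = nn.Parameter(torch.zeros(num_labels, d + 1, d + 1))

    def forward(self, bert_out: torch.Tensor) -> torch.Tensor:
        """bert_out: [batch, seq, hidden] -> span scores [batch, seq, seq, labels]."""
        x = self.dropout(bert_out)
        ones = x.new_ones(x.size(0), x.size(1), 1)
        start = torch.cat([self.start_mlp(x), ones], dim=-1)   # [b, s, d+1]
        end = torch.cat([self.end_mlp(x), ones], dim=-1)       # [b, s, d+1]
        # scores[b, i, j, l] = start[b, i] @ biaffine[l] @ end[b, j]
        return torch.einsum("bid,ldk,bjk->bijl", start, self.biaffine, end)


# Optimizer as in the setup row (AdamW, learning rate 2e-5 on ACE2004):
model = BiaffineSpanScorer()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
```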