Nested Named Entity Recognition with Partially-Observed TreeCRFs

Authors: Yao Fu, Chuanqi Tan, Mosha Chen, Songfang Huang, Fei Huang (pp. 12839-12847)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our approach achieves the state-of-the-art (SOTA) F1 scores on the ACE2004, ACE2005 dataset, and shows comparable performance to SOTA models on the GENIA dataset. We conduct experiments on three standard benchmark datasets.
Researcher Affiliation | Collaboration | Yao Fu1, Chuanqi Tan2, Mosha Chen2, Songfang Huang2, Fei Huang2; 1University of Edinburgh, 2Alibaba Group
Pseudocode | Yes | Algorithm 1 SYMBOL TREE AND MASK CONSTRUCTION; Algorithm 2 INSIDE FOR PARTIAL MARGINALIZATION; Algorithm 3 MASKED INSIDE (see the masked-inside sketch after this table)
Open Source Code | Yes | We release the code at https://github.com/FranxYao/Partially-Observed-TreeCRFs.
Open Datasets | Yes | We conduct experiments on the ACE2004, ACE2005 (Doddington et al. 2004), and GENIA (Kim et al. 2003) datasets.
Dataset Splits | Yes | The statistics of these datasets are shown in Table 1.
Hardware Specification | Yes | GPU: Nvidia P100; CPU: Intel 2.6GHz quad-core i7
Software Dependencies | No | While specific BERT models (bert-large-cased, BioBERT v1.1) are mentioned, the paper does not provide specific version numbers for key software libraries like PyTorch or Torch-Struct.
Experiment Setup | Yes | We use the AdamW optimizer with the learning rate 2e-5 on the ACE2004 dataset and 3e-5 on the ACE2005 and GENIA datasets. The ϵ used for structure smoothing is 0.01 on the ACE2004 dataset and 0.02 on the ACE2005 and GENIA datasets. We apply 0.2 dropout after BERT encoding. Denote the hidden size of the encoder as h (h = 1024 for BERT Large, and 768 for BioBERT). We apply two feed-forward layers before the biaffine scoring mechanism, with h and h/2 hidden size, respectively. (See the model/optimizer sketch after this table.)
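The Pseudocode row lists Algorithm 3 (MASKED INSIDE). The snippet below is a minimal sketch of how such a masked inside recursion can be written: span-factored log-potentials over binary trees, with a 0/1 mask that zeroes out (in probability space) any span that conflicts with the observed entity spans. The function name `masked_inside`, the tensor shapes, and the `NEG_INF` constant are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a masked inside recursion for a span-based TreeCRF,
# in the spirit of Algorithm 3 (MASKED INSIDE). Illustrative only.
import torch

NEG_INF = -1e9  # stand-in for -infinity in log space


def masked_inside(span_scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Log partition over binary trees whose spans respect `mask`.

    span_scores: [n, n] log-potentials; span_scores[i, j] scores span (i..j).
    mask:        [n, n] with 1 for allowed spans and 0 for forbidden ones
                 (e.g. spans that cross an observed entity span).
    Returns the log partition function of the constrained tree distribution.
    """
    n = span_scores.size(0)
    # Forbidden spans contribute exp(-inf) = 0 to the sum over trees.
    scores = span_scores + (1.0 - mask) * NEG_INF
    beta = torch.full((n, n), NEG_INF)
    idx = torch.arange(n)
    beta[idx, idx] = scores[idx, idx]                  # length-1 spans
    for width in range(1, n):                          # span length - 1
        for i in range(n - width):
            j = i + width
            # Combine all split points k: (i..k) + (k+1..j).
            left = beta[i, i:j]                        # beta[i, k], k in [i, j-1]
            right = beta[i + 1:j + 1, j]               # beta[k+1, j]
            beta[i, j] = scores[i, j] + torch.logsumexp(left + right, dim=0)
    return beta[0, n - 1]
```

With `mask = torch.ones(n, n)` this reduces to the ordinary inside algorithm; zeroing entries of `mask` removes every tree containing the corresponding spans from the partition function, which is what the partial marginalization over compatible trees requires.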
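The Experiment Setup row describes the scoring head and optimizer. Below is a minimal PyTorch sketch under those settings: 0.2 dropout after the BERT encoding, two feed-forward layers of size h and h/2 before a biaffine scorer, and AdamW with learning rate 2e-5. The class name, label count, and the exact biaffine parameterization are assumptions for illustration, not the released code.

```python
# Minimal sketch of a biaffine span-scoring head, assuming the settings in
# the Experiment Setup row. Names and label count are illustrative.
import torch
import torch.nn as nn


class BiaffineSpanScorer(nn.Module):
    def __init__(self, hidden_size: int = 1024, num_labels: int = 8):
        super().__init__()
        self.dropout = nn.Dropout(0.2)  # 0.2 dropout after BERT encoding
        # Two feed-forward layers with h and h/2 units before biaffine scoring.
        self.start_mlp = nn.Sequential(
            nn.Linear(hidden_size, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, hidden_size // 2), nn.ReLU(),
        )
        self.end_mlp = nn.Sequential(
            nn.Linear(hidden_size, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, hidden_size // 2), nn.ReLU(),
        )
        # One (d+1) x (d+1) biaffine matrix per label (bias via an appended 1).
        d = hidden_size // 2
        self.biaffine = nn.Parameter(torch.zeros(num_labels, d + 1, d + 1))

    def forward(self, bert_out: torch.Tensor) -> torch.Tensor:
        """bert_out: [batch, seq, hidden] -> span scores [batch, seq, seq, labels]."""
        x = self.dropout(bert_out)
        ones = x.new_ones(x.size(0), x.size(1), 1)
        start = torch.cat([self.start_mlp(x), ones], dim=-1)   # [b, s, d+1]
        end = torch.cat([self.end_mlp(x), ones], dim=-1)       # [b, s, d+1]
        # scores[b, i, j, l] = start[b, i] @ biaffine[l] @ end[b, j]
        return torch.einsum("bid,ldk,bjk->bijl", start, self.biaffine, end)


# Optimizer as in the setup row (AdamW, learning rate 2e-5 on ACE2004):
model = BiaffineSpanScorer()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
```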