Rethinking Boundaries: End-To-End Recognition of Discontinuous Mentions with Pointer Networks
Authors: Hao Fei, Donghong Ji, Bobo Li, Yijiang Liu, Yafeng Ren, Fei Li
AAAI 2021, pp. 12785–12793
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the CADEC and ShARe13 datasets show that our model outperforms flat and hypergraph models as well as a state-of-the-art transition-based model for discontinuous NER. |
| Researcher Affiliation | Academia | 1 Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan, China; 2 Guangdong University of Foreign Studies, Guangzhou, China. {hao.fei, dhji, boboli, cslyj, renyafeng}@whu.edu.cn, foxlf823@gmail.com |
| Pseudocode | No | The paper describes the model architecture and processes in text and diagrams, but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | Yes | We experiment on two datasets for discontinuous NER, namely CADEC (Karimi et al. 2015) and ShARe13 (Pradhan et al. 2013), both of which are derived from biomedical or clinical domain documents. |
| Dataset Splits | Yes | In Table 1, we present the detailed statistics of the two datasets. CADEC: 875 train / 187 dev / 188 test; ShARe13: 180 train / 19 dev / 99 test. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions various models and optimizers such as 'Transformer', 'LSTM', 'fastText', 'ELMo', 'BioBERT', and the 'Adam optimizer', but does not specify version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | The dimensions of word embeddings, position embeddings and character representations are 300, 30 and 50, respectively. We use a 3-layer Transformer with a 768-dimension hidden size as the encoder. The dimensions of all other intermediate representations are set to 300. The kernel sizes of the CNN are [3,4,5]. We adopt the Adam optimizer with an initial learning rate of 1e-4. The mini-batch size is set to 16. Moreover, the initial value of γ is set to 0.85 according to the development experiments. |
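
For readers who want to re-implement the reported setup, the following is a minimal configuration sketch of the hyperparameters quoted above. It assumes a Python dataclass; the class and field names are hypothetical and do not come from the paper or the authors' code.

```python
# Hypothetical hyperparameter configuration reconstructed from the reported
# experiment setup; names are illustrative, not the authors' actual code.
from dataclasses import dataclass, field
from typing import List


@dataclass
class PointerNetConfig:
    word_emb_dim: int = 300        # word embedding dimension
    pos_emb_dim: int = 30          # position embedding dimension
    char_repr_dim: int = 50        # character representation dimension
    encoder_layers: int = 3        # 3-layer Transformer encoder
    encoder_hidden: int = 768      # Transformer hidden size
    intermediate_dim: int = 300    # all other intermediate representations
    cnn_kernel_sizes: List[int] = field(default_factory=lambda: [3, 4, 5])
    optimizer: str = "adam"
    learning_rate: float = 1e-4    # initial learning rate for Adam
    batch_size: int = 16           # mini-batch size
    gamma_init: float = 0.85       # initial value of γ, tuned on the dev set


config = PointerNetConfig()
```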