Hierarchical Contextualized Representation for Named Entity Recognition

Authors: Ying Luo, Fengshun Xiao, Hai Zhao (pp. 8441-8448)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results on three benchmark NER datasets (CoNLL-2003 and OntoNotes 5.0 English datasets, CoNLL-2002 Spanish dataset) show that we establish new state-of-the-art results.
Researcher Affiliation | Academia | Department of Computer Science and Engineering, Shanghai Jiao Tong University; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China. {kingln, felixxiao}@sjtu.edu.cn, zhaohai@cs.sjtu.edu.cn
Pseudocode | No | The paper describes the model architecture and components with equations and figures, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code will be available at https://github.com/cslydia/HireNER.
Open Datasets | Yes | Our proposed representations are evaluated on three benchmark NER datasets: the CoNLL-2003 (Sang and De Meulder 2003) and OntoNotes 5.0 (Pradhan et al. 2013) English NER datasets, and the CoNLL-2002 Spanish NER (Tjong Kim Sang 2002) dataset.
Dataset Splits | Yes | CoNLL-2003 English NER consists of 22,137 sentences in total and is split into 14,987, 3,466, and 3,684 sentences for the training, development, and test sets, respectively. ... CoNLL-2002 Spanish NER consists of 11,752 sentences in total and is split into 8,322, 1,914, and 1,516 sentences for the training, development, and test sets, respectively. (These counts are restated as a small config sketch after this table.)
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for running the experiments.
Software Dependencies | No | The paper mentions software like GloVe embeddings and IntNet, but it does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | The batch size is set to 10; the initial learning rate is set to 0.015 and shrinks by 5% after each epoch. The hidden sizes of the sequence labeling encoder and the sentence-level encoder are set to 256 and 128, respectively. Dropout with a rate of 0.5 is applied to embeddings and hidden states. The λ used to fuse the original hidden state and the document-level representation is set to 0.3 empirically. (A hedged code sketch of this setup follows the table.)
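
For quick reference, the sentence counts quoted under Dataset Splits can be written down as a small Python config with a consistency check. This is only a restatement of the reported numbers; the SPLITS structure is made up here and does not come from the paper or its code.

# Reported sentence counts for the two CoNLL datasets (split sizes are as
# quoted in the Dataset Splits row above).
SPLITS = {
    "conll2003_en": {"train": 14_987, "dev": 3_466, "test": 3_684, "total": 22_137},
    "conll2002_es": {"train": 8_322, "dev": 1_914, "test": 1_516, "total": 11_752},
}

# Sanity check: the three splits should account for every sentence.
for name, s in SPLITS.items():
    assert s["train"] + s["dev"] + s["test"] == s["total"], name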
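
The Experiment Setup row also translates naturally into a short configuration sketch, shown below. Everything other than the reported numbers is an assumption: the function names are hypothetical, the 5% decay is read as a multiplicative per-epoch schedule (the paper summary does not say whether it is multiplicative or of the 1/(1 + decay·epoch) form common in NER work), and the λ fusion is assumed to be a convex combination h' = (1 − λ)·h + λ·d, which the quote above does not spell out.

import torch

# Hyperparameters as reported in the Experiment Setup row.
BATCH_SIZE = 10
INIT_LR = 0.015
LR_DECAY = 0.05      # learning rate shrinks by 5% after each epoch (assumed multiplicative)
SEQ_HIDDEN = 256     # sequence labeling encoder hidden size
SENT_HIDDEN = 128    # sentence-level encoder hidden size
DROPOUT = 0.5        # applied to embeddings and hidden states
LAMBDA = 0.3         # fusion weight for the document-level representation

def lr_at_epoch(epoch: int) -> float:
    """One plausible reading of 'shrinks by 5% after each epoch':
    multiply the initial rate by 0.95 once per completed epoch."""
    return INIT_LR * (1.0 - LR_DECAY) ** epoch

def fuse(hidden: torch.Tensor, doc_repr: torch.Tensor,
         lam: float = LAMBDA) -> torch.Tensor:
    """Assumed form of the fusion: a convex combination of the token-level
    hidden state and the document-level representation."""
    return (1.0 - lam) * hidden + lam * doc_repr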