Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Hierarchical Contextualized Representation for Named Entity Recognition
Authors: Ying Luo, Fengshun Xiao, Hai Zhao
Pages: 8441-8448
AAAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results on three benchmark NER datasets (CoNLL-2003 and Ontonotes 5.0 English datasets, CoNLL-2002 Spanish dataset) show that we establish new state-of-the-art results. |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, Shanghai Jiao Tong University; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China; EMAIL, EMAIL |
| Pseudocode | No | The paper describes the model architecture and components with equations and figures, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code will be available at https://github.com/cslydia/HireNER. |
| Open Datasets | Yes | Our proposed representations are evaluated on three benchmark NER datasets: CoNLL-2003 (Sang and De Meulder 2003) and OntoNotes 5.0 (Pradhan et al. 2013) English NER datasets, CoNLL-2002 Spanish NER (Tjong Kim Sang 2002) dataset. |
| Dataset Splits | Yes | CoNLL-2003 English NER consists of 22,137 sentences totally and is split into 14,987, 3,466 and 3,684 sentences for the training, development set and test sets, respectively. ... CoNLL-2002 Spanish NER consists of 11,752 sentences totally and is split into 8,322, 1,914 and 1,516 sentences for the training, development and test sets, respectively. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions software like GloVe embeddings and IntNet, but it does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | The batch size is set as 10, the initial learning rate is set to 0.015 and will shrunk by 5% after each epoch. The hidden size of sequence labeling encoder and the sentence-level encoder are set as 256 and 128, respectively. We apply dropout to embeddings and hidden states with a rate of 0.5. The λ used to fuse original hidden state and document-level representation is set as 0.3 empirically. |
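
The reported setup and split sizes can be collected into a small configuration sketch. This is an illustration only, not the authors' code: the class name `ExperimentSetup`, its field names, and both helper functions are hypothetical. The quoted phrase "shrunk by 5% after each epoch" does not say which decay rule is used, so two common readings are shown (multiplicative decay and inverse-time decay); which one the paper applies is an assumption either way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""
    batch_size: int = 10
    initial_lr: float = 0.015
    lr_shrink: float = 0.05        # "shrunk by 5% after each epoch"
    seq_hidden_size: int = 256     # sequence labeling encoder
    sent_hidden_size: int = 128    # sentence-level encoder
    dropout: float = 0.5           # applied to embeddings and hidden states
    fusion_lambda: float = 0.3     # weight fusing hidden state and document-level representation


def lr_multiplicative(cfg: ExperimentSetup, epoch: int) -> float:
    """Reading 1 (assumption): multiply the rate by (1 - 0.05) after every epoch."""
    return cfg.initial_lr * (1.0 - cfg.lr_shrink) ** epoch


def lr_inverse_time(cfg: ExperimentSetup, epoch: int) -> float:
    """Reading 2 (assumption): inverse-time decay, lr0 / (1 + 0.05 * epoch)."""
    return cfg.initial_lr / (1.0 + cfg.lr_shrink * epoch)


# Sentence counts quoted in the Dataset Splits row.
SPLITS = {
    "CoNLL-2003 English": {"total": 22137, "train": 14987, "dev": 3466, "test": 3684},
    "CoNLL-2002 Spanish": {"total": 11752, "train": 8322, "dev": 1914, "test": 1516},
}


def splits_are_consistent(splits: dict) -> bool:
    """Check that train + dev + test equals the reported total for every dataset."""
    return all(
        s["train"] + s["dev"] + s["test"] == s["total"] for s in splits.values()
    )


if __name__ == "__main__":
    cfg = ExperimentSetup()
    for epoch in range(3):
        print(epoch, lr_multiplicative(cfg, epoch), lr_inverse_time(cfg, epoch))
    print("splits consistent:", splits_are_consistent(SPLITS))
```

The two schedules start at the same rate and diverge only slightly in early epochs, so the ambiguity matters mainly for long training runs. The split check confirms that the quoted training, development, and test counts add up to the quoted totals for both datasets.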