Attention-based Multi-level Feature Fusion for Named Entity Recognition

Authors: Zhiwei Yang, Hechang Chen, Jiawei Zhang, Jing Ma, Yi Chang

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experimental results on four benchmark datasets show that our proposed model outperforms a set of state-of-the-art baselines."
Researcher Affiliation | Academia | (1) College of Computer Science and Technology, Jilin University, Changchun, China; (2) School of Artificial Intelligence, Jilin University, Changchun, China; (3) IFM Lab, Department of Computer Science, Florida State University, Tallahassee, FL, USA; (4) Department of Computer Science, Hong Kong Baptist University, Hong Kong, China; (5) Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, China; (6) International Center of Future Science, Jilin University, Changchun, China
Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper provides no statement or link regarding the availability of source code for the described method.
Open Datasets | Yes | "To verify the effectiveness of the proposed framework, we conduct experiments on the following four datasets, CoNLL2003 [Sang and De Meulder, 2003], NCBI-disease [Doğan et al., 2014], SciERC [Luan et al., 2018], and JNLPBA [Kim et al., 2004]." (A hedged loading sketch follows the table.)
Dataset Splits | Yes | "All datasets have been separated into train/develop/test sets, including 4/1/6/5 entity types, respectively. Table 1 presents some statistics of the 4 datasets." (That is, CoNLL2003, NCBI-disease, SciERC, and JNLPBA contain 4, 1, 6, and 5 entity types, respectively.)
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies | No | The paper mentions using GloVe pretrained word embeddings and training with SGD, but does not give version numbers for any software dependencies or libraries.
Experiment Setup | No | The paper states that dropout is applied and that the filter width k is set to 3. Hyper-parameter values are said to be initialized following the baselines, and parameters such as dropout rate, LSTM size, filter number, and batch size are examined only in a sensitivity analysis; the specific values used in the main experiments are not reported. (A hedged setup sketch follows the table.)
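
All four benchmarks are publicly distributed. The following is a minimal loading sketch using the Hugging Face datasets library; the paper does not say how the authors obtained the data, and the hub identifiers below (especially the SciERC mirror) are assumptions, not the authors' sources.

# Hypothetical sketch: loading the four benchmarks via the Hugging Face
# `datasets` library. The hub identifiers are assumptions based on
# community mirrors and may differ from the authors' original copies.
from datasets import load_dataset

conll = load_dataset("conll2003")             # 4 entity types, canonical train/dev/test split
ncbi = load_dataset("ncbi_disease")           # 1 entity type (Disease)
jnlpba = load_dataset("jnlpba")               # 5 entity types
scierc = load_dataset("DFKI-SLT/scierc")      # 6 entity types; assumed third-party mirror

# Print the split sizes, which should roughly match the paper's Table 1.
for name, ds in [("CoNLL2003", conll), ("NCBI-disease", ncbi),
                 ("JNLPBA", jnlpba), ("SciERC", scierc)]:
    print(name, {split: len(ds[split]) for split in ds})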
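Because the paper omits most hyper-parameter values, the sketch below is only an illustrative skeleton: a generic GloVe + character-CNN + BiLSTM tagger, not the authors' attention-based multi-level fusion model. It situates the one value the paper does state (filter width k = 3); every other number is a placeholder that a faithful reproduction would have to recover, e.g. from the ranges in the paper's sensitivity analysis.

# Illustrative PyTorch skeleton, assuming GloVe word embeddings, a
# character CNN with filter width k = 3 (the only stated value), a
# BiLSTM encoder, dropout, and SGD training. All other numbers are
# placeholders, NOT the authors' settings.
import torch
import torch.nn as nn

class CharCNNBiLSTM(nn.Module):
    def __init__(self, word_vocab, char_vocab, num_tags,
                 word_dim=100, char_dim=30, n_filters=30,   # placeholders
                 lstm_size=200, dropout=0.5):               # placeholders
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim)  # initialized from GloVe in practice
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # Filter width k = 3, per the paper.
        self.char_cnn = nn.Conv1d(char_dim, n_filters, kernel_size=3, padding=1)
        self.dropout = nn.Dropout(dropout)
        self.lstm = nn.LSTM(word_dim + n_filters, lstm_size,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * lstm_size, num_tags)

    def forward(self, words, chars):
        # words: (batch, seq_len); chars: (batch, seq_len, word_len)
        b, s, w = chars.shape
        c = self.char_emb(chars).view(b * s, w, -1).transpose(1, 2)
        c = torch.max(self.char_cnn(c), dim=2).values.view(b, s, -1)  # max-pool over characters
        x = self.dropout(torch.cat([self.word_emb(words), c], dim=-1))
        h, _ = self.lstm(x)
        return self.out(self.dropout(h))

model = CharCNNBiLSTM(word_vocab=20000, char_vocab=100, num_tags=9)
optimizer = torch.optim.SGD(model.parameters(), lr=0.015)  # learning rate is a guess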