A Simple Yet Effective Subsequence-Enhanced Approach for Cross-Domain NER
Authors: Jinpeng Hu, Dandan Guo, Yang Liu, Zhuo Li, Zhihong Chen, Xiang Wan, Tsung-Hui Chang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on several benchmark datasets illustrate the effectiveness of our model, which achieves considerable improvements. |
| Researcher Affiliation | Academia | ¹Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, China; ²The Chinese University of Hong Kong, Shenzhen; ³Pazhou Lab, Guangzhou, 510330, China. {jinpenghu, zhuoli3, zhihongchen, yangliu5}@link.cuhk.edu.cn, {guodandan, changtsunghui}@cuhk.edu.cn, wanxiang@sribd.cn |
| Pseudocode | No | The paper describes algorithms and mathematical formulations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides a link to the BERT model they used ('https://github.com/google-research/bert') but does not provide a link or explicit statement about releasing the source code for their own proposed methodology. |
| Open Datasets | Yes | To validate the effectiveness of our proposed model, we employ the following datasets in our experiments. We regard the Conll2003 as the source domain and other datasets as the target domains. Conll2003 (Sang and De Meulder 2003), MIT Movie (Movie) (Liu et al. 2013b), CrossNER (Liu et al. 2021b), MIT Restaurant (Restaurant) (Liu et al. 2013a). |
| Dataset Splits | Yes | We follow the official split for these datasets, with their statistics summarized in Table 1. (Table 1 reports, for each dataset, the number of entity types together with the sentence count, entity count, and average entities per sentence on the TRAIN, VAL, and TEST splits.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or processor types used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'bert-base-cased' as a pre-trained language model and 'Adam' for optimization, but it does not specify version numbers for these or other software dependencies (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | In our experiments, we use the standard BIO scheme to label NEs. We utilize a pre-trained language model (i.e., bert-base-cased) as the text encoder... We follow their default model setting: we use 12 layers of self-attention with 768-dimensional embeddings. Besides, the hidden size in BMRU is set to 768 for each direction with its parameters initialized randomly. We use Adam (Kingma and Ba 2015) to optimize all trainable parameters... k is set to 7 in our experiments. More detailed hyperparameters are reported in Appendix. For transfer training, we first train the model on the source domain data with 2 epochs for Conll2003 and then fine-tune the model to the target domain. |
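
To make the reported experiment setup concrete, the snippet below gives a minimal sketch, not the authors' released code. It assumes PyTorch and HuggingFace Transformers (the paper itself only links Google's BERT repository), loads `bert-base-cased` as the text encoder, adds a randomly initialized bidirectional GRU with a 768-dimensional hidden state per direction as a stand-in for the paper's BMRU module, predicts per-token BIO labels, and optimizes all trainable parameters with Adam. The label set, learning rate, and loss are placeholders, since the paper defers detailed hyperparameters to its appendix. For intuition on the BIO scheme: the sentence "John lives in New York" would be tagged B-PER O O B-LOC I-LOC.

```python
# Minimal sketch of the reported setup: bert-base-cased encoder + a randomly
# initialized bidirectional recurrent layer (768 per direction) + a BIO tagger,
# with all trainable parameters optimized by Adam. This illustrates the
# described configuration; it is not the authors' BMRU implementation.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

# Example BIO label inventory; the real one depends on the dataset
# (Conll2003, CrossNER, MIT Movie, MIT Restaurant).
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]


class SubsequenceEnhancedTagger(nn.Module):
    def __init__(self, num_labels: int, hidden: int = 768):
        super().__init__()
        # Pre-trained encoder: 12 self-attention layers, 768-dim embeddings.
        self.encoder = AutoModel.from_pretrained("bert-base-cased")
        # Stand-in for the paper's BMRU: bidirectional GRU, 768 per direction,
        # parameters initialized randomly.
        self.recurrent = nn.GRU(hidden, hidden, bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        states, _ = self.recurrent(states)
        return self.classifier(states)  # per-token BIO logits


tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = SubsequenceEnhancedTagger(num_labels=len(LABELS))

# Adam over all trainable parameters; the learning rate is a placeholder.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
loss_fn = nn.CrossEntropyLoss(ignore_index=-100)  # -100 masks special/pad tokens


def train_epoch(batches):
    """One pass over (input_ids, attention_mask, bio_labels) batches.
    Transfer training as described in the paper: run this for 2 epochs on the
    source domain (Conll2003), then reuse it to fine-tune on the target domain."""
    model.train()
    for input_ids, attention_mask, bio_labels in batches:
        logits = model(input_ids, attention_mask)
        loss = loss_fn(logits.reshape(-1, len(LABELS)), bio_labels.reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

The subsequence-specific component (the paper's k = 7 setting) is not reproduced here; the sketch only covers the encoder, recurrent layer, tagging head, optimizer, and two-stage transfer procedure that the quoted setup text describes.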