Span-based Unified Named Entity Recognition Framework via Contrastive Learning
Authors: Hongli Mao, Xian-Ling Mao, Hanlin Tang, Yu-Ming Shang, Xiaoyan Gao, Ao-Jie Ma, Heyan Huang
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on both supervised and zero/few-shot settings demonstrate that the proposed SUNER model achieves better performance and higher efficiency than previous state-of-the-art unified NER models. |
| Researcher Affiliation | Academia | 1 School of Computer Science & Technology, Beijing Institute of Technology, Beijing, China 2 Beijing University of Posts and Telecommunications, Beijing, China 3 Beijing University of Technology, Beijing, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | We train and evaluate our model on seven existing public NER benchmarks covering diverse domains such as news, biomedicine, movie, and restaurant. The used datasets include three nested NER datasets: ACE 2004, ACE 2005 and GENIA [Kim et al., 2003]; along with four flat NER datasets: CoNLL 2003 [Sang and De Meulder, 2003], OntoNotes 5, MIT Restaurant and MIT Movie [Liu and Lane, 2017]. |
| Dataset Splits | Yes | We use MIT Restaurant and MIT Movie datasets with standard train, dev, and test splits, while adopting the splits of Yu et al. [2020] for the remaining datasets. |
| Hardware Specification | Yes | All experiments are conducted on a single GeForce RTX 3090 with the same setting. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | In the span detection module, we set the auxiliary loss weight λ to 0.7 and the biaffine encoder hidden size to 300. The filtering thresholds θ1 and θ2 are set to 0.5 and 0.4, respectively. During training, all parameters are optimized using Adam with a peak learning rate of 1.5e-5, and hyper-parameter tuning is performed on the validation set. |
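The reported experiment setup can be collected into a single configuration, and the two filtering thresholds θ1 and θ2 suggest a score-based span filter. The sketch below is a minimal, hypothetical illustration of that idea: the hyper-parameter values come from the paper, but the `filter_spans` helper, the span-score representation, and how each threshold is applied are assumptions, since the paper does not restate the exact filtering rule here.

```python
# Hyper-parameters reported in the paper's experiment setup (SUNER).
CONFIG = {
    "aux_loss_weight_lambda": 0.7,   # auxiliary loss weight λ
    "biaffine_hidden_size": 300,     # biaffine encoder hidden size
    "theta1": 0.5,                   # filtering threshold θ1
    "theta2": 0.4,                   # filtering threshold θ2
    "optimizer": "Adam",
    "peak_learning_rate": 1.5e-5,
}


def filter_spans(span_scores, threshold):
    """Keep candidate spans whose detection score reaches the threshold.

    span_scores: dict mapping (start, end) token-index pairs to scores in [0, 1].
    This is a hypothetical reading of how θ1/θ2 might be used; the paper
    does not spell out the exact rule in the quoted setup.
    """
    return {span: p for span, p in span_scores.items() if p >= threshold}


# Toy usage with made-up candidate spans and scores.
scores = {(0, 2): 0.9, (1, 1): 0.45, (3, 5): 0.2}
kept_strict = filter_spans(scores, CONFIG["theta1"])  # θ1 = 0.5
kept_loose = filter_spans(scores, CONFIG["theta2"])   # θ2 = 0.4
```

The stricter threshold θ1 keeps only the high-confidence span here, while θ2 also admits the borderline one; whichever role each threshold actually plays in SUNER's pipeline, the mechanics of the filter are the same.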