MCL-NER: Cross-Lingual Named Entity Recognition via Multi-View Contrastive Learning
Authors: Ying Mo, Jian Yang, Jiahao Liu, Qifan Wang, Ruoyu Chen, Jingang Wang, Zhoujun Li
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on the XTREME benchmark, spanning 40 languages, demonstrate the superiority of MCL-NER over prior data-driven and model-based approaches. |
| Researcher Affiliation | Collaboration | ¹State Key Lab of Software Development Environment, Beihang University, Beijing, China; ²Meituan, Beijing, China; ³Meta AI, New York, United States; ⁴Beijing Information Science and Technology University, Beijing, China |
| Pseudocode | No | The paper describes the method using text and mathematical equations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an unambiguous statement or a direct link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | The proposed method is evaluated on the XTREME benchmark (Hu et al. 2020). Following the previous work (Hu et al. 2020), we use the same split for the train, validation, and test sets, including the LOC, PER, and ORG tags. All NER models use the English training data as the source language and evaluate on the other languages' data. We also run experiments on the CoNLL-02 and CoNLL-03 datasets (Sang 2002; Sang and Meulder 2003) covering 4 languages: Spanish (es), Dutch (nl), English (en), and German (de). |
| Dataset Splits | Yes | Following the previous work (Hu et al. 2020), we use the same split for the train, validation, and test sets, including the LOC, PER, and ORG tags. For the CoNLL-02 and CoNLL-03 datasets (Sang 2002; Sang and Meulder 2003), we split them into the train, validation, and test sets, following the prior work (Yang et al. 2022a). |
| Hardware Specification | No | The paper does not explicitly specify the hardware used for running experiments (e.g., specific GPU or CPU models, memory details). |
| Software Dependencies | No | The paper mentions optimizers and pre-trained models used but does not provide specific version numbers for general software dependencies such as programming languages or libraries like Python or PyTorch. |
| Experiment Setup | Yes | We set the batch size as 32 for XTREME-40 and CoNLL. We use AdamW (Loshchilov and Hutter 2019) for optimization with a learning rate of 1e-5 for the pre-trained model and 1e-3 for other extra components. The dimension of the projection representations for contrastive learning is set to 128. |
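The reported hyperparameters (batch size 32, AdamW with a split learning rate of 1e-5 for the pre-trained encoder and 1e-3 for the extra components, projection dimension 128) can be summarized in a minimal sketch. This is pure Python; the parameter-name prefix `encoder.` and the example parameter names are illustrative assumptions, not details from the paper:

```python
# Hypothetical sketch of the reported training configuration.
# Only the numeric values come from the paper; the grouping rule
# (prefix "encoder." marks pre-trained parameters) is an assumption.

BATCH_SIZE = 32        # reported for XTREME-40 and CoNLL
PROJ_DIM = 128         # projection dimension for contrastive learning
LR_PRETRAINED = 1e-5   # learning rate for the pre-trained model
LR_EXTRA = 1e-3        # learning rate for other extra components

def lr_for(param_name: str) -> float:
    """Assign a learning rate by (assumed) parameter-name prefix,
    mimicking how AdamW parameter groups would be built."""
    return LR_PRETRAINED if param_name.startswith("encoder.") else LR_EXTRA

# Illustrative parameter names; real models would enumerate named parameters.
params = ["encoder.layer.0.weight", "projection.weight", "classifier.bias"]
groups = {name: lr_for(name) for name in params}
```

In a framework such as PyTorch this rule would typically be realized as two AdamW parameter groups, one per learning rate, passed to the optimizer constructor.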