MCL-NER: Cross-Lingual Named Entity Recognition via Multi-View Contrastive Learning

Authors: Ying Mo, Jian Yang, Jiahao Liu, Qifan Wang, Ruoyu Chen, Jingang Wang, Zhoujun Li

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on the XTREME benchmark, spanning 40 languages, demonstrate the superiority of MCL-NER over prior data-driven and model-based approaches.
Researcher Affiliation | Collaboration | (1) State Key Lab of Software Development Environment, Beihang University, Beijing, China; (2) Meituan, Beijing, China; (3) Meta AI, New York, United States; (4) Beijing Information Science and Technology University, Beijing, China
Pseudocode | No | The paper describes the method using text and mathematical equations but does not include any explicit pseudocode or algorithm blocks. (An illustrative contrastive-loss sketch follows the table.)
Open Source Code | No | The paper does not include an unambiguous statement or a direct link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | The proposed method is evaluated on the XTREME benchmark (Hu et al. 2020). Following the previous work (Hu et al. 2020), we use the same split for the train, validation, and test sets, including the LOC, PER, and ORG tags. All NER models use the English training data as the source language and are evaluated on the data of the other languages. We also run experiments on the CoNLL-02 and CoNLL-03 datasets (Sang 2002; Sang and Meulder 2003), covering 4 languages: Spanish (es), Dutch (nl), English (en), and German (de).
Dataset Splits | Yes | Following the previous work (Hu et al. 2020), we use the same split for the train, validation, and test sets, including the LOC, PER, and ORG tags. All NER models use the English training data as the source language and are evaluated on the data of the other languages. We also run experiments on the CoNLL-02 and CoNLL-03 datasets (Sang 2002; Sang and Meulder 2003), covering 4 languages: Spanish (es), Dutch (nl), English (en), and German (de). We split them into the train, validation, and test sets, following the prior work (Yang et al. 2022a). (A data-loading sketch follows the table.)
Hardware Specification | No | The paper does not explicitly specify the hardware used for running the experiments (e.g., specific GPU or CPU models, memory details).
Software Dependencies | No | The paper mentions the optimizer and pre-trained models used but does not provide version numbers for general software dependencies such as the programming language (e.g., Python) or libraries (e.g., PyTorch).
Experiment Setup | Yes | We set the batch size as 32 for XTREME-40 and CoNLL. We use AdamW (Loshchilov and Hutter 2019) for optimization with a learning rate of 1e-5 for the pre-trained model and 1e-3 for the other extra components. The dimension of the projection representations for contrastive learning is set to 128. (An optimizer-setup sketch follows the table.)
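
Since the paper provides no algorithm block, the sketch below shows a generic InfoNCE-style contrastive objective of the kind multi-view methods typically optimize. It is an illustration only, not the authors' multi-view loss; the function name, the two-view setup, and the temperature value are all assumptions.

```python
# Illustration only: a generic InfoNCE-style contrastive loss, NOT the
# exact multi-view objective defined in the paper.
import torch
import torch.nn.functional as F

def info_nce_loss(view_a, view_b, temperature=0.07):
    """view_a, view_b: (batch, dim) projections of two views of the same spans."""
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature                     # pairwise similarities
    targets = torch.arange(a.size(0), device=a.device)   # positives on the diagonal
    return F.cross_entropy(logits, targets)
```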
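
For reference, the XTREME NER splits quoted in the dataset rows correspond to the PAN-X configurations on the HuggingFace Hub. The loader below is our own hedged sketch, not tooling released by the authors, and it assumes a `datasets` version that still supports script-based loaders.

```python
# Hedged sketch: loading the quoted NER splits via HuggingFace datasets.
# Assumes datasets < 3.0 (the "xtreme" loader is script-based).
from datasets import load_dataset

# English source-language data with train/validation/test splits.
en = load_dataset("xtreme", "PAN-X.en")
print({split: len(ds) for split, ds in en.items()})

# A target language for zero-shot evaluation, e.g. German.
de = load_dataset("xtreme", "PAN-X.de")

# CoNLL-02 Spanish and Dutch are also on the Hub; CoNLL-03 German
# requires the licensed original distribution.
es = load_dataset("conll2002", "es")
```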
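
The quoted experiment setup maps directly onto PyTorch parameter groups. The sketch below is a reconstruction under the stated hyperparameters; the encoder choice (xlm-roberta-base) and the single-layer 128-d projection head are assumptions, as the row does not name them.

```python
# Hedged sketch of the reported optimization setup: AdamW with lr 1e-5
# for the pre-trained encoder and 1e-3 for the extra components, and a
# 128-d projection space for contrastive learning.
import torch
from torch import nn
from transformers import AutoModel

encoder = AutoModel.from_pretrained("xlm-roberta-base")  # encoder name assumed
projection = nn.Linear(encoder.config.hidden_size, 128)  # extra component (assumed shape)

optimizer = torch.optim.AdamW([
    {"params": encoder.parameters(), "lr": 1e-5},
    {"params": projection.parameters(), "lr": 1e-3},
])
batch_size = 32  # as reported for XTREME-40 and CoNLL
```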