Document-level Relation Extraction as Semantic Segmentation

Authors: Ningyu Zhang, Xiang Chen, Xin Xie, Shumin Deng, Chuanqi Tan, Mosha Chen, Fei Huang, Luo Si, Huajun Chen

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | "Experimental results show that our approach can obtain state-of-the-art performance on three benchmark datasets DocRED, CDR, and GDA." (Section 4, Experiments)
Researcher Affiliation | Collaboration | Ningyu Zhang1,2, Xiang Chen1,2, Xin Xie1,2, Shumin Deng1,2, Chuanqi Tan3, Mosha Chen3, Fei Huang3, Luo Si3, Huajun Chen1,2. Affiliations: 1 Zhejiang University & AZFT Joint Lab for Knowledge Engine; 2 Hangzhou Innovation Center, Zhejiang University; 3 Alibaba Group. Emails: {zhangningyu,xiang chen,xx2020,231sm,huajunsir}@zju.edu.cn; {chuanqi.tcq,chenmosha.cms,f.huang,luo.si}@alibaba-inc.com
Pseudocode | No | The paper describes the methodology in text but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code and datasets are available in https://github.com/zjunlp/DocuNet." (footnote 1)
Open Datasets | Yes | "We evaluated our DocuNet model on three document-level RE datasets. ... DocRED [Yao et al., 2019] is a large-scale document-level relation extraction dataset built by crowdsourcing. ... CDR [Li et al., 2016] is a relation extraction dataset in the biomedical domain... GDA [Wu et al., 2019] is a dataset in the biomedical domain..."
Dataset Splits | Yes | "DocRED contains 3,053/1,000/1,000 instances for training, validation, and test, respectively. We list the dataset statistics in Table 1."
Hardware Specification | Yes | "We trained on one NVIDIA V100 16GB GPU and evaluated our model with Ign F1 and F1, following [Yao et al., 2019]."
Software Dependencies | No | "Our model was implemented based on PyTorch. We used cased BERT-base or RoBERTa-large as the encoder on DocRED and SciBERT-base [Beltagy et al., 2019] on CDR and GDA. We optimize our model with AdamW using a learning rate of 2e-5 with a linear warmup for the first 6% of steps." The framework and encoders are named, but no version numbers are given.
Experiment Setup | Yes | "We optimize our model with AdamW using a learning rate of 2e-5 with a linear warmup for the first 6% of steps. We set the matrix size N = 42. The context-based strategy is utilized by default." (See the sketches below.)
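As context for the checklist, the formulation named in the title can be made concrete in a few lines. The sketch below is an assumption-heavy illustration, not the authors' released code: it builds an N x N entity-pair feature map (the paper fixes the matrix size at N = 42) and scores every cell with a small convolutional head, treating relation classification as per-cell segmentation. The pairing operator, hidden width, and the 97-way label space (DocRED's 96 relation types plus a no-relation class) are stand-ins; see the repository linked above for the real model.

```python
# Hedged sketch of "document-level RE as semantic segmentation":
# score every (head, tail) entity pair by treating the N x N pair map
# as an image. Everything here is illustrative, not the DocuNet code.
import torch
import torch.nn as nn

N, d, num_rel = 42, 256, 97        # N = 42 per the paper; d and num_rel are assumptions

entity_reps = torch.randn(N, d)    # one pooled representation per entity (padded to N)

# Hypothetical pairing op: elementwise product of head/tail vectors.
# pair_map[i, j] is a d-dim feature for the entity pair (i, j).
pair_map = torch.einsum("id,jd->ijd", entity_reps, entity_reps)
pair_map = pair_map.permute(2, 0, 1).unsqueeze(0)   # (1, d, N, N), image-like layout

# Stand-in segmentation head; the paper's actual module differs.
segmenter = nn.Sequential(
    nn.Conv2d(d, 128, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(128, num_rel, kernel_size=1),
)
logits = segmenter(pair_map)       # (1, num_rel, N, N): one score map per relation class
```

The image-style layout is the point of the formulation: convolutions over the pair map let each pair's prediction condition on neighboring pairs, rather than scoring each entity pair in isolation.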
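The quoted optimization details (AdamW, learning rate 2e-5, linear warmup over the first 6% of steps) correspond to a standard PyTorch schedule. A minimal sketch follows, with a placeholder model and step count; the quotes do not say what happens after warmup, so the post-warmup linear decay here is an assumption:

```python
# Sketch of the quoted setup: AdamW at lr 2e-5 with linear warmup over
# the first 6% of steps. Model and total_steps are placeholders.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(768, 97)            # stand-in for the actual model
optimizer = AdamW(model.parameters(), lr=2e-5)

total_steps = 10_000                        # hypothetical; epochs x batches in practice
warmup_steps = int(0.06 * total_steps)      # "linear warmup for the first 6% of steps"

def lr_lambda(step: int) -> float:
    if step < warmup_steps:
        return step / max(1, warmup_steps)  # ramp linearly from 0 up to the base lr
    # Post-warmup behavior is unspecified in the quotes; linear decay to 0
    # is a common choice and an assumption here.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

scheduler = LambdaLR(optimizer, lr_lambda)

for _ in range(total_steps):
    optimizer.step()                        # loss.backward() elided in this sketch
    scheduler.step()
```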