Exploring Self-Distillation Based Relational Reasoning Training for Document-Level Relation Extraction
Authors: Liang Zhang, Jinsong Su, Zijun Min, Zhongjian Miao, Qingguo Hu, Biao Fu, Xiaodong Shi, Yidong Chen
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct comprehensive experiments on three benchmark datasets, of which experimental results demonstrate that our model consistently outperforms all competitive baselines. |
| Researcher Affiliation | Academia | ¹School of Informatics, Xiamen University, China; ²Key Laboratory of Digital Protection and Intelligent Processing of Intangible Cultural Heritage of Fujian and Taiwan (Xiamen University), Ministry of Culture and Tourism, China; lzhang@stu.xmu.edu.cn, {jssu,ydchen}@xmu.edu.cn |
| Pseudocode | No | The paper describes the model architecture and training process in narrative text and mathematical formulas, but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our source code is available at https://github.com/DeepLearnXMU/DocRE-SD. |
| Open Datasets | Yes | We evaluate our model on three commonly-used datasets: DocRED (Yao et al. 2019). It is a large-scale human-annotated dataset for document-level RE, which is constructed from Wikipedia and Wikidata. ... CDR (Li et al. 2016). It is a biomedical dataset and consists of 1,500 PubMed abstracts, which are equally divided into three sets for training, development, and testing. ... GDA (Wu et al. 2019). This dataset is a large-scale biomedical one, which is constructed from MEDLINE abstracts by method of distant supervision. |
| Dataset Splits | Yes | We follow the standard split of the dataset, 3,053 documents for training, 1,000 for development, and 1,000 for the test. ... CDR (Li et al. 2016). It is a biomedical dataset and consists of 1,500 PubMed abstracts, which are equally divided into three sets for training, development, and testing. ... We follow Tang et al. (2020) to divide the training set into two parts, 23,353 documents for training and 5,839 for development. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for the experiments. It only mentions the use of pre-trained models (BERT-base, RoBERTa-large, SciBERT-base) and the PyTorch framework. |
| Software Dependencies | No | Using PyTorch, we develop our model based on Huggingface's Transformers (Wolf et al. 2020). |
| Experiment Setup | Yes | Using PyTorch, we develop our model based on Huggingface's Transformers (Wolf et al. 2020). We use BERT-base (Devlin et al. 2019) or RoBERTa-large (Liu et al. 2019) as the encoder on DocRED, and SciBERT-base (Beltagy, Lo, and Cohan 2019) on CDR and GDA. We employ AdamW (Loshchilov and Hutter 2019) to optimize our model with a linear warmup (Goyal et al. 2017) for the first 6% steps. We empirically set the layer number L of reasoning module to 2. We apply dropout (Srivastava et al. 2014) between layers with rate 0.1, and clip the gradients of model parameters to a maximal norm of 1.0. All hyper-parameters are tuned on the development set. (A hedged sketch of this setup follows the table.) |
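
For orientation, below is a minimal PyTorch sketch of the training configuration quoted in the Experiment Setup row: AdamW with a linear warmup over the first 6% of steps and gradient clipping to a maximal norm of 1.0. The encoder checkpoint, learning rate, and step counts are illustrative placeholders rather than values reported in the paper, and the sketch omits the paper's reasoning module and self-distillation objectives.

```python
# Hedged sketch of the quoted setup: AdamW, linear warmup over the first 6% of
# training steps, and gradient clipping at max norm 1.0. Placeholders only; the
# paper does not report the learning rate or step counts used here.
import torch
from torch.optim import AdamW
from transformers import AutoModel, get_linear_schedule_with_warmup

encoder = AutoModel.from_pretrained("bert-base-cased")  # or RoBERTa-large / SciBERT-base
optimizer = AdamW(encoder.parameters(), lr=5e-5)        # lr is a placeholder

num_training_steps = 10_000                             # placeholder total step count
num_warmup_steps = int(0.06 * num_training_steps)       # warmup for the first 6% of steps
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=num_warmup_steps,
    num_training_steps=num_training_steps,
)

def training_step(batch_loss: torch.Tensor) -> None:
    """One optimization step with the gradient clipping reported in the paper."""
    optimizer.zero_grad()
    batch_loss.backward()
    torch.nn.utils.clip_grad_norm_(encoder.parameters(), max_norm=1.0)  # clip to norm 1.0
    optimizer.step()
    scheduler.step()
```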