Distilling Knowledge from Well-Informed Soft Labels for Neural Relation Extraction
Authors: Zhenyu Zhang, Xiaobo Shu, Bowen Yu, Tingwen Liu, Jiapeng Zhao, Quangang Li, Li Guo
AAAI 2020, pp. 9620–9627
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on the TACRED and SemEval datasets, the experimental results justify the effectiveness of our approach. |
| Researcher Affiliation | Academia | Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China. {zhangzhenyu1996, shuxiaobo, yubowen, liutingwen, zhaojiapeng, liquangang, guoli}@iie.ac.cn |
| Pseudocode | No | The paper includes architectural diagrams (Figure 2) and describes methods in text, but it does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code can be obtained from https://github.com/zzysay/KD4NRE. |
| Open Datasets | Yes | We conduct experiments on two widely used benchmark datasets: (1) TACRED (Zhang et al. 2017)... (2) SemEval (Hendrickx et al. 2010)... |
| Dataset Splits | Yes | Table 1: Statistics of the TACRED and SemEval datasets. #Train #Dev #Test |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU specifications, or memory used for the experiments. It only implies that models were trained and experiments performed. |
| Software Dependencies | No | The paper mentions using 'GloVe (Pennington, Socher, and Manning 2014) vectors' and the 'Stanford CoreNLP toolkit' but does not specify the version numbers for these software components, which is necessary for reproducibility. |
| Experiment Setup | Yes | We set the NA probability C to 0.2, the temperature of knowledge distillation τ to 1, the weight factor of hint learning λht to 1.8, and the weight factor of type constraints in Teacher-S γs to 0.8. The sizes of the position embedding dp and NER tag embedding dn in MAA are both set to 30. Inspired by Clark et al. (2019), we adopt the teacher annealing strategy: λkd increases from 0 to 1 linearly throughout the training stage of the student. |
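The teacher-annealing schedule quoted in the setup row can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, and exactly which of the two loss terms λkd multiplies is an assumption not confirmed by the report, the only grounded detail being that λkd rises linearly from 0 to 1 over student training.

```python
def lambda_kd(step, total_steps):
    """Linear teacher-annealing schedule (Clark et al. 2019, as adopted
    in the paper): lambda_kd rises from 0 at the start of student
    training to 1 at the end, clamped to [0, 1]."""
    return min(max(step / float(total_steps), 0.0), 1.0)


def student_loss(step, total_steps, loss_distill, loss_hard):
    """Combine the distillation loss and the hard-label loss with the
    annealed weight. NOTE: pairing lambda_kd with loss_distill here is
    an assumption for illustration; the paper's code may mix the terms
    the other way around."""
    lam = lambda_kd(step, total_steps)
    return lam * loss_distill + (1.0 - lam) * loss_hard
```

With this schedule the student leans on one loss term early in training and shifts linearly toward the other by the final step, matching the "0 to 1 linearly" description in the quoted setup.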