Consistent Inference for Dialogue Relation Extraction

Authors: Xinwei Long, Shuzi Niu, Yucheng Li

IJCAI 2021

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results on two benchmark datasets show that the F1 performance improvement of the proposed method is at least 3.3% compared with SOTA. We conduct comprehensive experiments on two benchmark datasets, DialogRE [Yu et al., 2020] and MPDD [Chen et al., 2020b], and CoIn shows 3.3% and 6.2% improvements in terms of F1 (DialogRE) and accuracy (MPDD) over state-of-the-art models. Ablation studies prove the effectiveness of each module.
Researcher Affiliation Academia (1) Institute of Software, Chinese Academy of Sciences; (2) University of Chinese Academy of Sciences. longxinwei19@mails.ucas.ac.cn, {shuzi, yucheng}@iscas.ac.cn
Pseudocode No The paper includes an architecture diagram (Figure 2) but does not contain structured pseudocode or algorithm blocks.
Open Source Code Yes Source code and pre-processed data are released at https://github.com/xinwei96/CoIn_dialogRE
Open Datasets Yes Datasets. (1) DialogRE [Yu et al., 2020]. We follow the standard settings offered by the original paper, and use F1 score as the metric. (2) MPDD [Chen et al., 2020b]. More details of DialogRE and processed MPDD can be found in Table 1. Source code and pre-processed data are released at https://github.com/xinwei96/CoIn_dialogRE
Dataset Splits Yes Table 1: Dataset Statistics. Dialog Num. 1073 / 358 / 357, Relation Num. 4992 / 1597 / 1529. (These numbers represent train / dev / test splits, where 'dev' typically serves as the validation set).
Hardware Specification Yes Experiments are conducted on a server with a GeForce GTX 1080Ti GPU and 64 GB memory.
Software Dependencies Yes Our model was implemented in PyTorch with CUDA 11.0.
Experiment Setup Yes We adopt the BERT-base architecture with a fine-tuning learning rate of 2e-5. We use a self-attention layer with dropout 0.2 and learning rate 5e-4. The number of windows K is set to 2, chosen from {1, 2, 3, 4}. We use AdamW [Loshchilov and Hutter, 2019] as the optimizer with a Cosine Annealing scheduler [Loshchilov and Hutter, 2017]. The threshold τ of the multi-label classifier and the trade-off parameters λ1 and λ2 are set to 0.51.
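
For readers reproducing the setup above, the following is a minimal PyTorch sketch of the reported optimizer and scheduler configuration (two learning rates, dropout 0.2, AdamW with cosine annealing, threshold 0.51). It is not the authors' released code: the checkpoint name "bert-base-uncased", the single attention head, and the scheduler horizon T_max=20 are assumptions not stated in the paper.

```python
# Sketch of the reported training configuration, assuming a Hugging Face
# BERT-base encoder and one self-attention layer on top (illustrative modules).
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR
from transformers import BertModel

encoder = BertModel.from_pretrained("bert-base-uncased")  # BERT-base backbone (checkpoint name assumed)
attention = torch.nn.MultiheadAttention(embed_dim=768, num_heads=1, dropout=0.2)  # self-attention layer, dropout 0.2

# Separate learning rates: 2e-5 for BERT fine-tuning, 5e-4 for the new layer.
optimizer = AdamW([
    {"params": encoder.parameters(), "lr": 2e-5},
    {"params": attention.parameters(), "lr": 5e-4},
])

# Cosine annealing schedule; T_max (number of epochs) is an assumption.
scheduler = CosineAnnealingLR(optimizer, T_max=20)

def predict(logits, tau=0.51):
    """Multi-label decision: output a relation when its sigmoid score exceeds the threshold τ = 0.51."""
    return (torch.sigmoid(logits) > tau).long()
```

In a training loop under this sketch, optimizer.step() would run per batch and scheduler.step() once per epoch; the paper does not specify the epoch budget, so T_max would need to match whatever schedule is actually used.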