Attention as Relation: Learning Supervised Multi-head Self-Attention for Relation Extraction
Authors: Jie Liu, Shaowei Chen, Bingquan Wang, Jiaxin Zhang, Na Li, Tong Xu
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To verify the effectiveness of our model, we conduct comprehensive experiments on two benchmark datasets. The experimental results demonstrate that our model achieves state-of-the-art performances. |
| Researcher Affiliation | Academia | (1) College of Artificial Intelligence, Nankai University, Tianjin, China; (2) College of Computer Science, Nankai University, Tianjin, China; (3) University of Science and Technology of China, Hefei, China |
| Pseudocode | No | The paper presents mathematical formulations and a model framework diagram, but it does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | https://github.com/NKU-IIPLab/SMHSA |
| Open Datasets | Yes | To verify the effectiveness of our model, we conduct extensive experiments on two benchmark datasets, including New York Times (NYT) [Riedel et al., 2010] and WebNLG [Gardent et al., 2017]. |
| Dataset Splits | Yes | To construct the development set, we randomly select 10% samples from the training set. The statistics of the above datasets are shown in Table 1. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions using "pre-trained GloVe 840B vectors" and the "RMSprop optimizer," but it does not specify software dependencies with version numbers (e.g., Python version, or a specific deep learning framework version such as PyTorch or TensorFlow). |
| Experiment Setup | Yes | The dimensions of hidden states for character LSTM, encoding layer, entity extraction module, and relation extraction module are set to 100, 600, 250, 250, respectively. ... The learning rate, learning rate decay, and batch size are set to 0.001, 0.95, and 10, respectively. To ensure the balance between entity extraction and relation detection, we adopt an iterative two-step training manner... To avoid overfitting, we apply dropout at a rate of 0.3. |
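
For quick reference, the training details quoted in the Experiment Setup row can be collected into a single configuration sketch. This is a minimal Python snippet assuming a plain dictionary-style config; the key names and structure are illustrative assumptions (the paper's released code may organize these differently), and only the values come from the reported setup.

```python
# Hypothetical configuration assembled from the hyperparameters reported in the paper.
# Key names are assumptions for illustration; values are as quoted above.
config = {
    "char_lstm_hidden_dim": 100,        # character LSTM hidden state size
    "encoder_hidden_dim": 600,          # encoding layer hidden state size
    "entity_module_hidden_dim": 250,    # entity extraction module
    "relation_module_hidden_dim": 250,  # relation extraction module
    "word_embeddings": "GloVe 840B (pre-trained)",
    "optimizer": "RMSprop",
    "learning_rate": 0.001,
    "learning_rate_decay": 0.95,
    "batch_size": 10,
    "dropout": 0.3,                     # applied to avoid overfitting
    "training_scheme": "iterative two-step (entity extraction / relation detection)",
}
```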