Beyond Word Attention: Using Segment Attention in Neural Relation Extraction
Authors: Bowen Yu, Zhenyu Zhang, Tingwen Liu, Bin Wang, Sujian Li, Quangang Li
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments are conducted on the TACRED dataset. Results show that our model achieves state-of-the-art performance on the fully-supervised RE task. We conduct qualitative analyses to understand how our model works with the help of segment attention, including evaluation of the extracted relational expressions. |
| Researcher Affiliation | Collaboration | 1Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China 2School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China 3Xiaomi AI Lab, Xiaomi Inc., Beijing, China 4Key Laboratory of Computational Linguistics, Peking University, MOE, China |
| Pseudocode | No | The paper describes algorithmic steps and equations (e.g., for the CRF and the forward-backward algorithm) but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks. (A minimal forward-algorithm sketch appears after this table.) |
| Open Source Code | Yes | The source code of this paper can be obtained from https://github.com/yubowen-ph/segment. |
| Open Datasets | Yes | We conduct experiments on the widely used benchmark TACRED dataset introduced in [Zhang et al., 2017], which is currently the largest supervised dataset for relation extraction. |
| Dataset Splits | Yes | For fair comparisons, we report the test score of the run with the median validation score among 5 randomly initialized runs, following the evaluation protocol used in [Zhang et al., 2017]. All the hyper-parameters are tuned on the validation set. (A sketch of this median-run selection appears after this table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software like 'Stanford CoreNLP' and 'GloVe embeddings' but does not provide specific version numbers for these or other key software dependencies. |
| Experiment Setup | Yes | Dropout with p = 0.5 is applied after the input layer and before the classifier layer. λ1 and λ2 are chosen from [0, 0.2] via grid search. For the LSTM, the hidden dimension is set to 300 with a 2-layer stacked BiLSTM. The model is trained using stochastic gradient descent for 30 epochs with an initial learning rate of 1.0 and a weight decay of 0.5. (A configuration sketch appears after this table.) |
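
The CRF and forward-backward computations mentioned in the Pseudocode row are given only as equations in the paper. As a rough, non-authoritative illustration of the forward pass such a layer relies on, here is a minimal log-space forward algorithm for a linear-chain CRF; the array names are hypothetical and start/stop transitions are omitted for brevity:

```python
import numpy as np

def crf_log_partition(emissions, transitions):
    """Log partition function Z of a linear-chain CRF, computed with the
    forward algorithm in log space for numerical stability.

    emissions:   (seq_len, num_tags) unary scores per position
    transitions: (num_tags, num_tags) score of moving from tag i to tag j
    """
    alpha = emissions[0]  # forward log-scores at position 0
    for t in range(1, len(emissions)):
        # alpha'[j] = logsumexp_i(alpha[i] + transitions[i, j]) + emissions[t, j]
        scores = alpha[:, None] + transitions + emissions[t][None, :]
        alpha = np.logaddexp.reduce(scores, axis=0)
    return np.logaddexp.reduce(alpha)  # log Z over all tag sequences
```

Running the same recursion backward from the end of the sequence gives the beta scores, and combining alpha and beta yields the per-position marginals that forward-backward is used for.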
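
The evaluation protocol quoted in the Dataset Splits row (report the test score of the run whose validation score is the median over 5 random seeds) is easy to misread, so here is a minimal sketch of it; each run is assumed to be summarized as a `(val_f1, test_f1)` pair, and the numbers in the example are invented:

```python
def reported_test_score(runs):
    """runs: list of (val_f1, test_f1) pairs from independently seeded runs.
    Returns the test score of the run with the median validation score,
    as in the protocol of [Zhang et al., 2017]. Assumes an odd run count."""
    runs = sorted(runs, key=lambda r: r[0])  # order runs by validation F1
    val_median_run = runs[len(runs) // 2]    # middle run has the median val score
    return val_median_run[1]                 # report its test F1

# Five invented (val_f1, test_f1) pairs:
print(reported_test_score([(66.1, 65.0), (66.8, 65.7), (66.5, 65.2),
                           (67.0, 66.1), (66.3, 64.9)]))  # -> 65.2
```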
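
The hyper-parameters in the Experiment Setup row translate into a short PyTorch configuration. The sketch below is an assumption-laden stand-in, not the authors' code: the 300-d input (GloVe) and the 42-way TACRED label set are inferred, the last-step readout replaces the paper's segment-attention pooling, and the quoted weight decay of 0.5 is omitted because its schedule is not specified in the excerpt:

```python
import torch
import torch.nn as nn

EMB_DIM = 300      # assumed: 300-d GloVe input embeddings
HIDDEN = 300       # reported: LSTM hidden dimension 300
NUM_CLASSES = 42   # assumed: TACRED's 41 relations + no_relation

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.input_dropout = nn.Dropout(0.5)   # reported: dropout after input layer
        self.bilstm = nn.LSTM(EMB_DIM, HIDDEN, num_layers=2,
                              bidirectional=True, batch_first=True)
        self.pre_classifier_dropout = nn.Dropout(0.5)  # reported: dropout before classifier
        self.classifier = nn.Linear(2 * HIDDEN, NUM_CLASSES)

    def forward(self, embedded):  # embedded: (batch, seq_len, EMB_DIM)
        states, _ = self.bilstm(self.input_dropout(embedded))
        # Simplification: read out the last time step; the paper instead pools
        # the BiLSTM states with segment attention.
        return self.classifier(self.pre_classifier_dropout(states[:, -1]))

model = Encoder()
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)  # reported: SGD, 30 epochs, lr 1.0
```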