SEE: Syntax-Aware Entity Embedding for Neural Relation Extraction

Authors: Zhengqiu He, Wenliang Chen, Zhenghua Li, Meishan Zhang, Wei Zhang, Min Zhang

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on a widely used real-world dataset and the experimental results show that our model can make full use of all informative instances and achieve state-of-the-art performance of relation extraction.
Researcher Affiliation | Collaboration | (1) School of Computer Science and Technology, Soochow University, China; (2) Alibaba Group, China; (3) School of Computer Science and Technology, Heilongjiang University, China
Pseudocode | No | The paper describes methods in prose and with diagrams (Figure 2, Figure 3, Figure 4) but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | No mention of or link to open-source code for the methodology.
Open Datasets | Yes | We adopt the benchmark dataset developed by Riedel, Yao, and McCallum (2010), which has been widely used in many recent works (Hoffmann et al. 2011; Surdeanu et al. 2012; Lin et al. 2016; Ji et al. 2017). Riedel, Yao, and McCallum (2010) use Freebase as the distant supervision source and the three-year NYT corpus from 2005 to 2007 as the text corpus.
Dataset Splits | No | We tune the hyper-parameters of all the baseline and our proposed models on the training dataset using three-fold validation. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU/CPU models, memory, or cloud instance types).
Software Dependencies | Yes | First, we employ the off-the-shelf Stanford Parser to parse the New York Times (NYT) corpus (Klein and Manning 2003). The parser referenced is https://nlp.stanford.edu/software/lex-parser.shtml, version 3.7.0. (A hedged parsing sketch follows the table.)
Experiment Setup | Yes | We try {0.1, 0.15, 0.2, 0.25} for the initial learning rate of SGD, {50, 100, 150, 200} for the mini-batch size of SGD, {50, 80, 100} for both the word and the dependency embedding dimensions, {5, 10, 20} for the position embedding dimension, {3, 5, 7} for the convolution window size l, and {60, 120, 180, 240, 300} for the filter number K. We find the configuration 0.2/150/50/50/5/3/240 works well for all the models, and further tuning leads to slight improvement. (A grid-search sketch follows the table.)
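To make the Experiment Setup row concrete, below is a minimal sketch of an exhaustive search over the hyper-parameter grid quoted above. Only the value ranges and the reported best configuration come from the paper; the `train_and_score` function and its dummy return value are placeholders, not the authors' training code.

```python
from itertools import product

# Hyper-parameter grid quoted in the paper (Experiment Setup row).
grid = {
    "learning_rate": [0.1, 0.15, 0.2, 0.25],    # initial SGD learning rate
    "batch_size":    [50, 100, 150, 200],       # SGD mini-batch size
    "word_dim":      [50, 80, 100],             # word embedding dimension
    "dep_dim":       [50, 80, 100],             # dependency embedding dimension
    "position_dim":  [5, 10, 20],               # position embedding dimension
    "window_size":   [3, 5, 7],                 # convolution window size l
    "filter_num":    [60, 120, 180, 240, 300],  # number of filters K
}

def train_and_score(config):
    # Placeholder: train the model with `config` and return a validation score.
    # Returning 0.0 keeps the sketch runnable without the real training loop.
    return 0.0

best_config, best_score = None, float("-inf")
for values in product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    score = train_and_score(config)
    if score > best_score:
        best_config, best_score = config, score

# Configuration reported to work well for all models: 0.2/150/50/50/5/3/240,
# i.e. learning_rate=0.2, batch_size=150, word_dim=50, dep_dim=50,
# position_dim=5, window_size=3, filter_num=240.
```

In the paper this tuning is done on the training set with three-fold validation, so `train_and_score` would average the scores of the three folds (see the next sketch).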
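The Dataset Splits row only records that hyper-parameters were tuned with three-fold validation on the training data; no explicit split files are released. A minimal sketch of such a split, assuming scikit-learn's KFold (the paper does not name a splitting tool):

```python
from sklearn.model_selection import KFold

# Illustrative stand-in for the training instances (e.g. entity-pair bags);
# the real NYT/Freebase training set is much larger.
train_items = list(range(10000))

kfold = KFold(n_splits=3, shuffle=True, random_state=0)
for fold, (train_idx, dev_idx) in enumerate(kfold.split(train_items)):
    # Train candidate hyper-parameter settings on `train_idx`
    # and score them on `dev_idx`; average the three fold scores.
    print(f"fold {fold}: {len(train_idx)} train / {len(dev_idx)} dev")
```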
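For the Software Dependencies row, one way to drive Stanford Parser 3.7.0 from Python is NLTK's (now deprecated) Stanford wrapper; the jar paths and the example sentence below are assumptions, and the paper does not specify how the parser was invoked.

```python
from nltk.parse.stanford import StanfordDependencyParser

# Paths are illustrative; they must point at the jars from the Stanford Parser
# 3.7.0 release (https://nlp.stanford.edu/software/lex-parser.shtml).
parser = StanfordDependencyParser(
    path_to_jar="stanford-parser-full-2016-10-31/stanford-parser.jar",
    path_to_models_jar="stanford-parser-full-2016-10-31/stanford-parser-3.7.0-models.jar",
)

sentence = "Barack Obama was born in Honolulu ."
graph = next(parser.raw_parse(sentence))

# Dependency triples ((head word, head tag), relation, (dependent word, dependent tag)),
# the kind of syntactic context the paper uses to build entity embeddings.
for head, rel, dep in graph.triples():
    print(head, rel, dep)
```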