Global Distant Supervision for Relation Extraction

Authors: Xianpei Han, Le Sun

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that, by exploiting the consistency between relation labels, the consistency between relations and arguments, and the consistency between neighbor instances using Markov logic, our method significantly outperforms traditional DS approaches. We test our model on a publicly available data set. Experimental results show that our method significantly outperforms traditional DS methods.
Researcher Affiliation | Academia | State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; {xianpei, sunle}@nfs.iscas.ac.cn
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions using 'Stanford’s MIMLRE package (Surdeanu et al., 2012), which is open source' for baselines, but does not release source code for the method it describes.
Open Datasets | Yes | We evaluate our method on a publicly available data set KBP, which was developed by Surdeanu et al. (2012).
Dataset Splits | Yes | This paper tunes and tests different methods using the same partitions and the same evaluation method as Surdeanu et al. (2012). We tune our global distant supervision model using the validation partition of KBP.
Hardware Specification | No | The paper does not specify the hardware used to run its experiments (no CPU/GPU models, processor speeds, or memory amounts).
Software Dependencies | No | The paper names the algorithms it uses (PSCG, SampleSAT, MaxWalkSAT) and a third-party package (Stanford’s MIMLRE), but gives no version numbers for any software dependency.
Experiment Setup | Yes | We tune our global distant supervision model using the validation partition of KBP. After tuning for different MLN models, we used the PSCG algorithm (5 samples, 10–20 iterations, step length 0.03) and the SampleSAT inference algorithm (5,000,000–10,000,000 flips with 20% noise flips, 30% random ascent flips, and 50% SA flips) for learning. Because positive/negative instances are highly imbalanced in the training corpus, we put a higher misclassification cost (the tuned value is 2.0) on positive instances. For the KNNOf evidence predicates, we use 10 nearest neighbors for each instance (with similarity > 0.2).
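
The Experiment Setup row is the one place where concrete, reusable numbers appear, so a short sketch may help a re-implementer. Below is a minimal Python sketch that collects the quoted hyperparameters into one configuration and shows one way the KNNOf evidence predicates could be built (up to 10 neighbors per instance, similarity > 0.2). Cosine similarity is an assumption, as are all identifiers; the excerpt names neither the similarity measure nor any API.

```python
import numpy as np

# Illustrative configuration; key names are invented, but the numeric
# values are the tuned settings quoted in the Experiment Setup row.
CONFIG = {
    "pscg": {"samples": 5, "iterations": (10, 20), "step_length": 0.03},
    "sample_sat": {
        "flips": (5_000_000, 10_000_000),
        "noise_flip_frac": 0.20,        # 20% noise flips
        "random_ascent_frac": 0.30,     # 30% random ascent flips
        "sa_flip_frac": 0.50,           # 50% simulated-annealing flips
    },
    "positive_misclassification_cost": 2.0,
    "knn": {"k": 10, "min_similarity": 0.2},
}

def knn_of(features: np.ndarray, k: int = 10, min_sim: float = 0.2):
    """Build KNNOf neighbor lists: for each instance, keep up to k
    neighbors whose similarity exceeds min_sim. Assumes cosine
    similarity over row feature vectors."""
    # L2-normalize rows so that dot products equal cosine similarities.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -1.0)  # an instance is never its own neighbor
    neighbors = []
    for i in range(sims.shape[0]):
        order = np.argsort(sims[i])[::-1]  # most similar first
        neighbors.append([int(j) for j in order[:k] if sims[i, j] > min_sim])
    return neighbors
```

Each returned neighbor list could then be grounded as KNNOf(i, j) evidence atoms for the Markov logic network; this is one plausible grounding consistent with the setup quoted above, not the paper's confirmed implementation.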