Neural Relation Extraction within and across Sentence Boundaries
Authors: Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Thomas Runkler (pp. 6513-6520)
AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our models on four datasets from newswire (MUC6) and medical (BioNLP shared task) domains, achieving state-of-the-art performance and a better balance in precision and recall for inter-sentential relationships. We perform better than 11 teams participating in the BioNLP shared task 2016 and achieve a gain of 5.2% (0.587 vs 0.558) in F1 over the winning team. |
| Researcher Affiliation | Collaboration | Pankaj Gupta (1,2), Subburam Rajaram (1), Hinrich Schütze (2), Thomas Runkler (1). 1: Corporate Technology, Machine-Intelligence (MIC-DE), Siemens AG Munich, Germany; 2: CIS, University of Munich (LMU) Munich, Germany. {pankaj.gupta, subburam.rajaram}@siemens.com; pankaj.gupta@campus.lmu.de |
| Pseudocode | No | No explicit pseudocode or algorithm blocks (e.g., labeled 'Pseudocode' or 'Algorithm') are present in the paper. The methodology is described using text and mathematical equations. |
| Open Source Code | Yes | Code, data and supplementary are available at https://github.com/pgcool/Cross-sentence-Relation-Extraction-iDepNN. |
| Open Datasets | Yes | We evaluate our proposed methods on four datasets from the medical and news domains. The three medical domain datasets are taken from the BioNLP shared task (ST) of relation/event extraction (Bossy et al. 2011; Nédellec et al. 2013; Deléger et al. 2016). [...] The MUC6 (Grishman and Sundheim 1996) dataset contains information about management succession events from newswire. |
| Dataset Splits | Yes | We have standard train/dev/test splits for the BioNLP ST 2016 dataset, while we perform 3-fold cross-validation on BioNLP ST 2011 and 2013. For BioNLP ST 2016, we generate negative examples by randomly sampling co-occurring entities without known interactions. Then we sample the same number as positives to obtain a balanced dataset during training and validation for different sentence ranges. [...] We randomly split the collection 60/20/20 into train/dev/test. (See the sampling sketch after the table.) |
| Hardware Specification | No | No specific hardware details (such as CPU/GPU models, memory, or cloud instance types) used for running experiments are explicitly mentioned in the paper. |
| Software Dependencies | No | The paper mentions the 'Stanford CoreNLP dependency parser', 'NetworkX', and 'GloVe embeddings', but does not provide specific version numbers for these or any other software dependencies, making it not reproducibly described. (See the dependency-path sketch after the table.) |
| Experiment Setup | Yes | For MUC6, we use the pretrained GloVe (Pennington, Socher, and Manning 2014) embeddings (200-dimensional). For the BioNLP datasets, we use 200-dimensional embedding vectors trained on six billion words of biomedical text (Moen and Ananiadou 2013). We randomly initialize 5-dimensional vectors for PI and POS. We initialize the recurrent weight matrix to identity and biases to zero. We use the macro-averaged F1 score (the official evaluation script of SemEval-2010 Task 8 (Hendrickx et al. 2010)) on the development set to choose hyperparameters (see supplementary). (See the initialization sketch after the table.) |
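The negative sampling and splitting procedure quoted in the Dataset Splits row can be made concrete with a short sketch. This is a minimal reconstruction, not the authors' released code: the entity-pair representation, the `known_interactions` set, and the fixed seed are assumptions introduced here for illustration.

```python
import random

def build_balanced_examples(co_occurring_pairs, known_interactions, seed=0):
    """Balance positives with an equal number of sampled negatives.

    co_occurring_pairs: iterable of (entity_a, entity_b) pairs that co-occur
        within the allowed sentence range.
    known_interactions: set of pairs annotated with a relation (positives).
    """
    rng = random.Random(seed)
    positives = [p for p in co_occurring_pairs if p in known_interactions]
    candidates = [p for p in co_occurring_pairs if p not in known_interactions]
    # Sample as many negatives as there are positives -> balanced data.
    negatives = rng.sample(candidates, min(len(positives), len(candidates)))
    return positives, negatives

def split_60_20_20(examples, seed=0):
    """Random 60/20/20 train/dev/test split, as described for MUC6."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    return (shuffled[:int(0.6 * n)],
            shuffled[int(0.6 * n):int(0.8 * n)],
            shuffled[int(0.8 * n):])
```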
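The Software Dependencies row names the Stanford CoreNLP dependency parser and NetworkX, which the paper combines to extract dependency paths between entity pairs. The sketch below shows the generic shortest-dependency-path pattern these tools support; the edge list and tokens are hypothetical stand-ins for parser output, since the paper pins no versions or API details.

```python
import networkx as nx

# Hypothetical (head, dependent) edges for one parsed sentence, in the shape
# a dependency parser such as Stanford CoreNLP would produce.
edges = [
    ("joined", "Smith"),     # nsubj
    ("joined", "Acme"),      # dobj
    ("joined", "as"),        # prep
    ("as", "chairman"),      # pobj
]

# Treat the dependency tree as an undirected graph for path finding.
graph = nx.Graph(edges)

# Shortest dependency path between the two entity head words.
print(nx.shortest_path(graph, source="Smith", target="chairman"))
# -> ['Smith', 'joined', 'as', 'chairman']
```

For inter-sentential pairs, the paper links the dependency trees of adjacent sentences so that the same path search can span sentence boundaries.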
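The Experiment Setup row quotes two implementable details: the recurrent weight matrix initialized to identity with zero biases (an IRNN-style trick), and hyperparameter selection by macro-averaged F1 on the development set. A minimal sketch, assuming a plain PyTorch `nn.RNN` with illustrative sizes; note the paper scores with the official SemEval-2010 Task 8 script, for which scikit-learn's `f1_score` is only a stand-in.

```python
import torch
import torch.nn as nn
from sklearn.metrics import f1_score

# Illustrative sizes: 200-dim inputs (matching the embeddings), 100 hidden units.
rnn = nn.RNN(input_size=200, hidden_size=100, batch_first=True)

with torch.no_grad():
    nn.init.eye_(rnn.weight_hh_l0)   # recurrent weight matrix = identity
    rnn.bias_ih_l0.zero_()           # biases = zero
    rnn.bias_hh_l0.zero_()

# Model selection: macro-averaged F1 on the dev set (illustrative labels).
dev_gold = [0, 1, 2, 1]
dev_pred = [0, 1, 1, 1]
print(f1_score(dev_gold, dev_pred, average="macro"))
```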