Neural Relation Extraction within and across Sentence Boundaries
Authors: Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Thomas Runkler (pp. 6513-6520)
AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our models on four datasets from newswire (MUC6) and medical (BioNLP shared task) domains, achieving state-of-the-art performance and a better balance in precision and recall for inter-sentential relationships. We perform better than 11 teams participating in the BioNLP shared task 2016 and achieve a gain of 5.2% (0.587 vs 0.558) in F1 over the winning team. |
| Researcher Affiliation | Collaboration | Pankaj Gupta (1,2), Subburam Rajaram (1), Hinrich Schütze (2), Thomas Runkler (1). 1: Corporate Technology, Machine-Intelligence (MIC-DE), Siemens AG Munich, Germany; 2: CIS, University of Munich (LMU) Munich, Germany. {pankaj.gupta, subburam.rajaram}@siemens.com; pankaj.gupta@campus.lmu.de |
| Pseudocode | No | No explicit pseudocode or algorithm blocks (e.g., labeled 'Pseudocode' or 'Algorithm') are present in the paper. The methodology is described using text and mathematical equations. |
| Open Source Code | Yes | Code, data and supplementary are available at https://github.com/pgcool/Cross-sentence-Relation-Extraction-iDepNN. |
| Open Datasets | Yes | We evaluate our proposed methods on four datasets from the medical and news domains. The three medical domain datasets are taken from the BioNLP shared task (ST) of relation/event extraction (Bossy et al. 2011; Nédellec et al. 2013; Deléger et al. 2016). [...] The MUC6 (Grishman and Sundheim 1996) dataset contains information about management succession events from newswire. |
| Dataset Splits | Yes | We have standard train/dev/test splits for the BioNLP ST 2016 dataset, while we perform 3-fold cross-validation on BioNLP ST 2011 and 2013. For BioNLP ST 2016, we generate negative examples by randomly sampling co-occurring entities without known interactions. Then we sample the same number as positives to obtain a balanced dataset during training and validation for different sentence ranges. [...] We randomly split the collection 60/20/20 into train/dev/test. (See the sampling sketch after the table.) |
| Hardware Specification | No | No specific hardware details (such as CPU/GPU models, memory, or cloud instance types) used for running experiments are explicitly mentioned in the paper. |
| Software Dependencies | No | The paper mentions the 'Stanford CoreNLP dependency parser', 'NetworkX', and 'GloVe embeddings', but does not provide specific version numbers for these or any other software dependencies, making it not reproducibly described. (See the dependency-path sketch after the table.) |
| Experiment Setup | Yes | For MUC6, we use the pretrained GloVe (Pennington, Socher, and Manning 2014) embeddings (200-dimensional). For the BioNLP datasets, we use 200-dimensional embedding vectors trained on six billion words of biomedical text (Moen and Ananiadou 2013). We randomly initialize 5-dimensional vectors for PI and POS. We initialize the recurrent weight matrix to identity and biases to zero. We use the macro-averaged F1 score (the official evaluation script of SemEval-2010 Task 8 (Hendrickx et al. 2010)) on the development set to choose hyperparameters (see supplementary). (See the initialization sketch after the table.) |
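The negative sampling and splitting procedure quoted in the Dataset Splits row can be made concrete with a short sketch. This is a minimal reconstruction, not the authors' released code: the entity-pair representation, the `known_interactions` set, and the fixed seed are assumptions introduced here for illustration.

```python
import random

def build_balanced_examples(co_occurring_pairs, known_interactions, seed=0):
    """Balance positives with an equal number of sampled negatives.

    co_occurring_pairs: iterable of (entity_a, entity_b) pairs that co-occur
        within the allowed sentence range.
    known_interactions: set of pairs annotated with a relation (positives).
    """
    rng = random.Random(seed)
    positives = [p for p in co_occurring_pairs if p in known_interactions]
    candidates = [p for p in co_occurring_pairs if p not in known_interactions]
    # Sample as many negatives as there are positives -> balanced data.
    negatives = rng.sample(candidates, min(len(positives), len(candidates)))
    return positives, negatives

def split_60_20_20(examples, seed=0):
    """Random 60/20/20 train/dev/test split, as described for MUC6."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    return (shuffled[:int(0.6 * n)],
            shuffled[int(0.6 * n):int(0.8 * n)],
            shuffled[int(0.8 * n):])
```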
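The Software Dependencies row names the Stanford CoreNLP dependency parser and NetworkX, which the paper combines to extract dependency paths between entity pairs. The sketch below shows the generic shortest-dependency-path pattern these tools support; the edge list and tokens are hypothetical stand-ins for parser output, since the paper pins no versions or API details.

```python
import networkx as nx

# Hypothetical (head, dependent) edges for one parsed sentence, in the shape
# a dependency parser such as Stanford CoreNLP would produce.
edges = [
    ("joined", "Smith"),     # nsubj
    ("joined", "Acme"),      # dobj
    ("joined", "as"),        # prep
    ("as", "chairman"),      # pobj
]

# Treat the dependency tree as an undirected graph for path finding.
graph = nx.Graph(edges)

# Shortest dependency path between the two entity head words.
print(nx.shortest_path(graph, source="Smith", target="chairman"))
# -> ['Smith', 'joined', 'as', 'chairman']
```

For inter-sentential pairs, the paper links the dependency trees of adjacent sentences so that the same path search can span sentence boundaries.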
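The Experiment Setup row quotes two implementable details: the recurrent weight matrix initialized to identity with zero biases (an IRNN-style trick), and hyperparameter selection by macro-averaged F1 on the development set. A minimal sketch, assuming a plain PyTorch `nn.RNN` with illustrative sizes; note the paper scores with the official SemEval-2010 Task 8 script, for which scikit-learn's `f1_score` is only a stand-in.

```python
import torch
import torch.nn as nn
from sklearn.metrics import f1_score

# Illustrative sizes: 200-dim inputs (matching the embeddings), 100 hidden units.
rnn = nn.RNN(input_size=200, hidden_size=100, batch_first=True)

with torch.no_grad():
    nn.init.eye_(rnn.weight_hh_l0)   # recurrent weight matrix = identity
    rnn.bias_ih_l0.zero_()           # biases = zero
    rnn.bias_hh_l0.zero_()

# Model selection: macro-averaged F1 on the dev set (illustrative labels).
dev_gold = [0, 1, 2, 1]
dev_pred = [0, 1, 1, 1]
print(f1_score(dev_gold, dev_pred, average="macro"))
```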