Span-Based Event Coreference Resolution
Authors: Jing Lu, Vincent Ng
AAAI 2021, pp. 13489-13497
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on the standard evaluation dataset provide affirmative answers to all three questions. |
| Researcher Affiliation | Academia | Jing Lu and Vincent Ng, Human Language Technology Research Institute, University of Texas at Dallas, Richardson, TX 75083-0688. {ljwinnie,vince}@hlt.utdallas.edu |
| Pseudocode | No | The paper describes model structures and mathematical functions, and includes architectural diagrams, but it does not contain structured pseudocode or algorithm blocks explicitly labeled as such. |
| Open Source Code | No | The paper neither links to nor makes an explicit statement about the availability of source code for its methodology. It links to SpanBERT (https://github.com/facebookresearch/SpanBERT), which is a third-party tool the authors use, not their own implementation. |
| Open Datasets | Yes | We employ the English corpora made available to us as part of the TAC KBP 2017 Event Nugget Detection and Coreference task. For training, we use LDC2015E29, E68, E73, E94 and LDC2016E72. |
| Dataset Splits | Yes | Train/Dev/Test splits (Table 1: Dataset statistics): #docs 735/82/167; #event mentions 20458/2436/4375; #event chains 12988/1806/2963; #entity mentions 43450/8161/13860; #entity chains 15094/3180/5482. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, memory) used to run its experiments. It mentions using 'SpanBERT-large' but not the underlying hardware. |
| Software Dependencies | No | The paper mentions 'SpanBERT-large' and the 'Stanford CoreNLP toolkit (Manning et al. 2014)' but does not give version numbers for these or for ancillary software dependencies (e.g., Python, PyTorch, CUDA, scikit-learn). (See the loading sketch after the table.) |
| Experiment Setup | Yes | Implementation details. We use SpanBERT-large in the span representation layer. We split each document into segments of length 512 and generate all spans of up to 10 tokens. Each FFNN has one hidden layer of size 3000. The width feature embedding has size 20. For span pruning, we keep the top 30% of the spans; for candidate antecedent pruning, we keep the top 20 antecedents. We train with document-sized mini-batches and a dropout rate of 0.3. Following Joshi et al. (2019), we use different learning rates for the task parameters and the SpanBERT parameters: the task learning rate is 1 × 10^-5 and the SpanBERT learning rate is 2 × 10^-4, both decayed linearly. (A hedged sketch of these settings follows the table.) |
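
Since the authors did not release their own implementation, the following is a minimal, hedged sketch of one way to obtain the SpanBERT-large weights the paper builds on. The Hugging Face Hub checkpoint `SpanBERT/spanbert-large-cased` is a third-party mirror and is not mentioned in the paper; SpanBERT reuses BERT's cased vocabulary, so the standard BERT tokenizer is loaded separately in case the checkpoint ships without tokenizer files.

```python
from transformers import AutoModel, BertTokenizer

# Hedged sketch: the paper links only to github.com/facebookresearch/SpanBERT.
# The Hugging Face Hub mirror below is one publicly available distribution of
# the same weights; using it is an assumption, not the authors' setup.
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")  # SpanBERT reuses BERT's cased vocab
spanbert = AutoModel.from_pretrained("SpanBERT/spanbert-large-cased")

# Encode one segment, capped at the 512-token segment length the paper uses.
inputs = tokenizer("Two bombs exploded near the embassy.", return_tensors="pt",
                   max_length=512, truncation=True)
hidden_states = spanbert(**inputs).last_hidden_state  # token representations for span building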
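The Experiment Setup row likewise translates directly into code. Below is a minimal PyTorch sketch of the span enumeration and pruning regime it describes; all numeric values are taken from the row, while every function and variable name is our own illustration (the paper reports these settings in prose only).

```python
import torch

# Values below are taken verbatim from the paper's "Implementation details";
# the names are hypothetical, since the authors' code is not available.
SEGMENT_LENGTH = 512    # documents are split into 512-token segments
MAX_SPAN_WIDTH = 10     # all spans of up to 10 tokens are generated
FFNN_HIDDEN = 3000      # each FFNN has one hidden layer of size 3000
WIDTH_EMB_SIZE = 20     # span-width feature embedding size
SPAN_KEEP_RATIO = 0.3   # span pruning keeps the top 30% of spans
MAX_ANTECEDENTS = 20    # antecedent pruning keeps the top 20 candidates
DROPOUT = 0.3
TASK_LR = 1e-5          # task parameters, linearly decayed (as reported)
SPANBERT_LR = 2e-4      # SpanBERT parameters, linearly decayed (as reported)


def enumerate_spans(num_tokens: int, max_width: int = MAX_SPAN_WIDTH):
    """All (start, end) token index pairs, inclusive, spanning up to max_width tokens."""
    return [(s, s + w) for s in range(num_tokens)
            for w in range(max_width) if s + w < num_tokens]


def prune_spans(span_scores: torch.Tensor,
                keep_ratio: float = SPAN_KEEP_RATIO) -> torch.Tensor:
    """Indices of the highest-scoring spans (top keep_ratio of all candidates)."""
    num_to_keep = max(1, int(span_scores.numel() * keep_ratio))
    return torch.topk(span_scores, num_to_keep).indices


# Usage on one segment: score every candidate span, keep the top 30%.
spans = enumerate_spans(SEGMENT_LENGTH)
scores = torch.randn(len(spans))   # stand-in for learned span scores
kept = prune_spans(scores)         # indices into `spans`
```

In a full training setup, the two reported learning rates would naturally map onto two optimizer parameter groups, one for the SpanBERT encoder and one for the task-specific layers, each with its own linear decay schedule.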