Span-Based Semantic Role Labeling with Argument Pruning and Second-Order Inference

Authors: Zixia Jia, Zhaohui Yan, Haoyi Wu, Kewei Tu (pp. 10822-10830)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments / Experiment Settings: We experiment on the CoNLL 2005 (Carreras and Márquez 2005) and CoNLL 2012 (Pradhan et al. 2012) English datasets following the official training-development-test split. We evaluate the performance of our model using the official script on the micro-average F1 score for correctly predicting (predicate, argument span, label) tuples. We report results of two SRL settings. In the predicted predicates setting, our SRLN treats each word in a sentence as a candidate predicate and predicts the existence and label of each predicate-argument pair. In order to compare with previous work, in the gold predicates setting, our SRLN takes gold predicates as input and only needs to find semantic roles of gold predicates. We repeat each experiment three times and report the average results. ... Main Results: Our model does not use any syntactic information, so we compare our results with previous syntax-agnostic neural models on the CoNLL 2005 (in-domain WSJ and out-of-domain Brown) and CoNLL 2012 test sets. The latest results of syntax-aware models are also listed for reference. (A hedged sketch of the tuple-level micro-F1 computation follows the table.)
Researcher Affiliation | Academia | Zixia Jia*1,2,3,4, Zhaohui Yan*1,2,3,4, Haoyi Wu1, Kewei Tu1,2 (1School of Information Science and Technology, ShanghaiTech University; 2Shanghai Engineering Research Center of Intelligent Vision and Imaging; 3Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences; 4University of Chinese Academy of Sciences)
Pseudocode | No | The paper describes iterative update steps and inference procedures, but does not include a formally labeled "Pseudocode" or "Algorithm" block or structured steps formatted like code.
Open Source Code | Yes | We provide our source code in https://github.com/JZXXX/Span-srl.
Open Datasets | Yes | We experiment on the CoNLL 2005 (Carreras and Márquez 2005) and CoNLL 2012 (Pradhan et al. 2012) English datasets following the official training-development-test split.
Dataset Splits | Yes | We experiment on the CoNLL 2005 (Carreras and Márquez 2005) and CoNLL 2012 (Pradhan et al. 2012) English datasets following the official training-development-test split.
Hardware Specification | Yes | As to the running time, our framework can finish training in no more than one day in all the cases, but the models of He et al. (2018) and Li et al. (2019) need more than 36 hours for training on the TITAN V GPU with their provided settings.
Software Dependencies | No | The paper mentions software components like "pretrained GloVe embeddings", "ELMo embeddings", "BERT", "RoBERTa", and "Adam and AMSGrad optimizer", but does not provide specific version numbers for any of these or for the core programming environment (e.g., Python, PyTorch).
Experiment Setup | Yes | We tune the hyper-parameter λ of PAPN between {0.8, 1.0, 1.5} and find that λ does not have much effect on the results because the recall of our PAPN is high. The hyper-parameters a and b of the span representation in SRLN are set to 560 and 40 respectively, following the proportion of a and b set by Seo et al. (2019). We do not tune these two hyper-parameters. ... We use the Adam and AMSGrad optimizer (Reddi, Kale, and Kumar 2018) to optimize our loss functions (Eq. 1 and Eq. 3). We tune the constant α in Eq. 3 between {0.03, 0.05, 0.1, 0.2} with different experiment settings and keep the values of most hyper-parameters the same as in previous work (Dozat and Manning 2018). (A hedged configuration sketch of these choices follows the table.)
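As referenced in the Research Type row above, evaluation is micro-average F1 over (predicate, argument span, label) tuples. The snippet below is a minimal illustrative sketch of that tuple-level metric, not the official CoNLL evaluation script; the function name micro_f1 and the exact tuple layout are assumptions made for illustration.

```python
# Minimal sketch of tuple-level micro-average F1 (not the official CoNLL scorer).
# Each tuple is assumed to be (predicate_index, (span_start, span_end), role_label).
def micro_f1(gold_tuples, pred_tuples):
    gold, pred = set(gold_tuples), set(pred_tuples)
    correct = len(gold & pred)  # a prediction counts only if all three fields match
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: one of two predicted tuples matches one of two gold tuples -> F1 = 0.5.
gold = [(3, (0, 2), "ARG0"), (3, (5, 7), "ARG1")]
pred = [(3, (0, 2), "ARG0"), (3, (6, 7), "ARG1")]
print(micro_f1(gold, pred))  # 0.5
```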
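The Experiment Setup row quotes the tuned and fixed hyper-parameters and the use of the Adam and AMSGrad optimizer. Below is a hedged sketch of how those choices might be wired up, assuming a PyTorch implementation (the paper does not state the framework or its version); the learning rate and the placeholder model are assumptions, not values from the paper.

```python
# Hedged configuration sketch, assuming PyTorch; only the grids and sizes quoted
# from the paper are taken as given, everything else is a placeholder.
import torch

hyperparams = {
    "papn_lambda_grid": [0.8, 1.0, 1.5],   # λ of PAPN, tuned; reported to have little effect
    "span_repr_a": 560,                    # span representation size a (not tuned)
    "span_repr_b": 40,                     # span representation size b (not tuned)
    "alpha_grid": [0.03, 0.05, 0.1, 0.2],  # constant α in Eq. 3, tuned per experiment setting
}

model = torch.nn.Linear(16, 16)  # placeholder standing in for the actual SRLN model

# "Adam and AMSGrad optimizer (Reddi, Kale, and Kumar 2018)" corresponds to Adam
# with the amsgrad flag in PyTorch; the learning rate here is an assumption.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, amsgrad=True)
```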