Language Model Priming for Cross-Lingual Event Extraction

Authors: Steven Fincke, Shantanu Agarwal, Scott Miller, Elizabeth Boschee

AAAI 2022, pp. 10627-10635 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that by enabling the language model to better compensate for the deficits of sparse and noisy training data, our approach improves both trigger and argument detection and classification significantly over the state of the art in a zero-shot cross-lingual setting.
Researcher Affiliation | Academia | University of Southern California Information Sciences Institute
Pseudocode | No | The paper includes architectural diagrams (e.g., Figures 1, 2, 3, and 4) but no explicit pseudocode blocks or algorithms.
Open Source Code | No | The paper does not provide a link or an explicit statement about releasing its own source code. It mentions using third-party tools such as spaCy, UDPipe, and Farasa, and a codebase for data splits from DyGIE++.
Open Datasets | Yes | We report results in two experimental settings, both using the ACE 2005 corpus (English and Arabic) (https://www.ldc.upenn.edu/collaborations/past-projects/ace).
Dataset Splits | Yes | Our primary experimental setting uses the standard English document train/dev/test splits for this dataset (Yang and Mitchell 2016) and the Arabic splits proposed by Xu et al. (2021).
Hardware Specification | No | The paper mentions using specific language models such as "BERT (Devlin et al. 2019)" and "XLM-RoBERTa (Conneau et al. 2020)", but it does not specify any hardware details (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions several software components, including "torchcrf", "spaCy", "UDPipe", and "Farasa", but it does not provide specific version numbers for these dependencies, which are required for full reproducibility. It links to documentation or cites papers for some of these tools, but explicit versions are absent.
Experiment Setup | Yes | All models fine tune all the layers of the language model and only use the output from the final layer. ... All results reported in this paper other than Table 2 use this experimental setting and are the average of five seeds. ... For language models we use the large, cased version of BERT (Devlin et al. 2019) for the monolingual English condition and the large version of XLM-RoBERTa (Conneau et al. 2020) for cross-lingual or Arabic-only conditions. (A hedged loading sketch follows the table.)
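
The Experiment Setup row names the encoders and the fine-tuning regime, but since the paper releases no code, the following Python sketch only illustrates how that configuration might be instantiated. It assumes the Hugging Face transformers library and the public checkpoints bert-large-cased and xlm-roberta-large; the load_encoder helper, the setting labels, and the seed value are illustrative choices, not details taken from the paper.

```python
# Minimal sketch (not the authors' released code) mirroring the reported setup:
# fine-tune all encoder layers and use only the final layer's output.
import torch
from transformers import AutoModel, AutoTokenizer


def load_encoder(setting: str):
    """Return the tokenizer and encoder named in the paper for each condition."""
    # "large, cased version of BERT" for monolingual English;
    # "large version of XLM-RoBERTa" for cross-lingual or Arabic-only conditions.
    name = "bert-large-cased" if setting == "english-monolingual" else "xlm-roberta-large"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    # "fine tune all the layers": keep every encoder parameter trainable
    # rather than freezing lower layers.
    for param in model.parameters():
        param.requires_grad = True
    return tokenizer, model


torch.manual_seed(0)  # the paper averages over five seeds; 0 is a placeholder
tokenizer, encoder = load_encoder("cross-lingual")
batch = tokenizer(["An example sentence."], return_tensors="pt")
outputs = encoder(**batch)
# "only use the output from the final layer": last_hidden_state holds the
# final-layer token representations that downstream extraction heads would consume.
final_layer = outputs.last_hidden_state  # shape: (batch, seq_len, hidden)
```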