Language Model Priming for Cross-Lingual Event Extraction
Authors: Steven Fincke, Shantanu Agarwal, Scott Miller, Elizabeth Boschee
AAAI 2022, pp. 10627–10635
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that by enabling the language model to better compensate for the deficits of sparse and noisy training data, our approach improves both trigger and argument detection and classification significantly over the state of the art in a zero-shot cross-lingual setting. |
| Researcher Affiliation | Academia | University of Southern California Information Sciences Institute |
| Pseudocode | No | The paper includes architectural diagrams (e.g., Figure 1, 2, 3, 4) but no explicit pseudocode blocks or algorithms. |
| Open Source Code | No | The paper does not provide a link or explicit statement about releasing its own source code. It mentions using third-party tools like spaCy, UDPipe, and Farasa, and a codebase for data splits from DyGIE++. |
| Open Datasets | Yes | We report results in two experimental settings, both using the ACE 2005 corpus (English and Arabic): https://www.ldc.upenn.edu/collaborations/past-projects/ace |
| Dataset Splits | Yes | Our primary experimental setting uses the standard English document train/dev/test splits for this dataset (Yang and Mitchell 2016) and the Arabic splits proposed by Xu et al. (2021). |
| Hardware Specification | No | The paper mentions using specific language models like "BERT (Devlin et al. 2019)" and "XLM-RoBERTa (Conneau et al. 2020)", but it does not specify any hardware details (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions several software components such as "torchcrf", "spaCy", "UDPipe", and "Farasa", and links to documentation or cites papers for some of them, but it does not provide the specific version numbers needed for full reproducibility. |
| Experiment Setup | Yes | All models fine tune all the layers of the language model and only use the output from the final layer. ... All results reported in this paper other than Table 2 use this experimental setting and are the average of five seeds. ... For language models we use the large, cased version of BERT (Devlin et al. 2019) for the monolingual English condition and the large version of XLM-RoBERTa (Conneau et al. 2020) for cross-lingual or Arabic-only conditions. |
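
The Experiment Setup row names the encoders and the five-seed averaging, but the paper does not state which loading framework was used. Below is a minimal, hypothetical sketch assuming the Hugging Face `transformers` library and its `bert-large-cased` / `xlm-roberta-large` checkpoints; the model identifiers and seed values are assumptions, not details from the paper.

```python
from transformers import AutoModel, AutoTokenizer

# Hypothetical mapping of experimental condition to pretrained checkpoint.
# The paper specifies BERT large cased (English-only) and XLM-RoBERTa large
# (cross-lingual / Arabic-only); the Hugging Face IDs are our assumption.
MODELS = {
    "monolingual_english": "bert-large-cased",       # Devlin et al. 2019
    "cross_lingual_or_arabic": "xlm-roberta-large",  # Conneau et al. 2020
}

# The paper averages results over five seeds but does not list them;
# these values are placeholders.
SEEDS = [0, 1, 2, 3, 4]

def load_encoder(setting: str):
    """Load the pretrained encoder for a given experimental condition.

    All layers are left trainable, matching the paper's statement that every
    layer of the language model is fine-tuned and only the final layer's
    output is used downstream.
    """
    name = MODELS[setting]
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    for param in model.parameters():
        param.requires_grad = True  # fine-tune all layers
    return tokenizer, model
```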