Zero-Shot Cross-Lingual Event Argument Extraction with Language-Oriented Prefix-Tuning
Authors: Pengfei Cao, Zhuoran Jin, Yubo Chen, Kang Liu, Jun Zhao
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our proposed method achieves the best performance, outperforming the previous state-of-the-art model by 4.8% and 2.3% of the average F1-score on two multilingual EAE datasets. |
| Researcher Affiliation | Collaboration | (1) National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, China; (2) School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; (3) Beijing Academy of Artificial Intelligence, Beijing, 100084, China |
| Pseudocode | No | The paper describes the proposed method in detail but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about releasing the source code for the proposed method, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We evaluate our method on two EE datasets, including ACE2005 (Doddington et al. 2004) and ERE (Song et al. 2015). |
| Dataset Splits | Yes | For a fair comparison with previous work (Huang et al. 2022), we use the same dataset split and preprocessing methods to keep 33 event types and 22 argument roles. |
| Hardware Specification | Yes | Each experiment is conducted on NVIDIA RTX A6000 GPUs. |
| Software Dependencies | No | In our implementation, our method uses the Hugging Face Transformers library to implement the encoder-decoder mT5 (base and large) model. We first use a pretrained universal dependency parser (e.g., Stanza (Qi et al. 2020)) to obtain the language-universal dependency tree of the input sentence. However, specific version numbers for these libraries are not provided. A loading sketch for this stack appears after the table. |
| Experiment Setup | Yes | The learning rate is initialized as 3e-5 or 1e-4 with a linear decay for the mT5-base or mT5-large models, respectively. We utilize the AdamW algorithm (Loshchilov and Hutter 2017) to optimize model parameters. The batch size is set to 8. Our method generates output sequences by using beam search, whose beam size is set to 4. The length of prefix L is set to 30. The distance hyper-parameter δ is set to 2. The number of training epochs is 100. A hedged configuration sketch appears after the table. |
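
The Software Dependencies row names Hugging Face Transformers (for mT5) and Stanza (for Universal Dependencies parsing) without versions. The following is a minimal sketch of how that stack can be loaded; the checkpoint name `google/mt5-base`, the English Stanza pipeline, and the example sentence are assumptions for illustration, not the authors' code.

```python
# Sketch of the stated dependency stack: Transformers for the mT5
# encoder-decoder, Stanza for language-universal dependency parsing.
import stanza
from transformers import MT5ForConditionalGeneration, MT5Tokenizer

# Load the pretrained mT5-base encoder-decoder (the paper also uses mT5-large).
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-base")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")

# Stanza provides Universal Dependencies parses; English is used as an example.
stanza.download("en")
nlp = stanza.Pipeline("en", processors="tokenize,pos,lemma,depparse")

doc = nlp("The committee elected a new chairman in March.")
for word in doc.sentences[0].words:
    head = doc.sentences[0].words[word.head - 1].text if word.head > 0 else "ROOT"
    print(f"{word.text:>10} --{word.deprel}--> {head}")
```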
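
The Experiment Setup row reports concrete hyperparameters (AdamW, lr 3e-5 with linear decay for mT5-base, batch size 8, 100 epochs, beam size 4). Below is a hedged sketch of that configuration under assumptions: the `google/mt5-base` checkpoint, a placeholder `steps_per_epoch`, and a dummy prompt; the prefix-tuning module, dataset wiring, and training loop are omitted.

```python
# Hedged sketch of the reported training/inference configuration (mT5-base values).
import torch
from transformers import (
    MT5ForConditionalGeneration,
    MT5Tokenizer,
    get_linear_schedule_with_warmup,
)

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-base")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-base")

batch_size, num_epochs = 8, 100
steps_per_epoch = 1000              # placeholder: len(train_loader) in practice
total_steps = steps_per_epoch * num_epochs

# AdamW with lr 3e-5 for mT5-base (the paper uses 1e-4 for mT5-large),
# decayed linearly over training.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=total_steps
)

# Inference: generate the output sequence with beam search, beam size 4.
inputs = tokenizer("Example EAE prompt goes here.", return_tensors="pt")
outputs = model.generate(**inputs, num_beams=4, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```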