Efficient Document-level Event Extraction via Pseudo-Trigger-aware Pruned Complete Graph

Authors: Tong Zhu, Xiaoye Qu, Wenliang Chen, Zhefeng Wang, Baoxing Huai, Nicholas Yuan, Min Zhang

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results show that our PTPCG can reach competitive results with only 19.8% of the parameters of DAG-based SOTA models, taking just 3.8% of the GPU hours to train, and running up to 8.5 times faster at inference.
Researcher Affiliation | Collaboration | 1. Institute of Artificial Intelligence, School of Computer Science and Technology, Soochow University, China; 2. Huawei Cloud, China
Pseudocode | No | The paper describes algorithms (e.g., the Bron-Kerbosch algorithm and a non-autoregressive decoding algorithm) and illustrates concepts with figures, but it does not provide formal pseudocode blocks or algorithms labeled as such.
Open Source Code | Yes | Codes are available at https://github.com/Spico197/DocEE.
Open Datasets | Yes | We use ChFinAnn [Zheng et al., 2019] and DuEE-fin [Li, 2021] datasets to make fair comparisons across all methods.
Dataset Splits | Yes | We choose the hyper-parameters of our system according to the performance on the development set of ChFinAnn.
Hardware Specification | Yes | Figure 4: Inference speed comparison with baselines (left) and with different |R| (right), with 1 NVIDIA V100 GPU for all models.
Software Dependencies | No | The paper mentions software components such as 'BiLSTM' and the 'Adam optimizer' but does not specify version numbers for general software dependencies (e.g., Python, PyTorch/TensorFlow, or specific library versions).
Experiment Setup | Yes | In PTPCG, we use 2 layers of shared BiLSTM for event detection and entity extraction, and another 2 layers of BiLSTM for entity encoding. We use the same vocabulary as [Zheng et al., 2019] and randomly initialize all the embeddings, where d_h = 768 and d_l = 32. The Adam [Kingma and Ba, 2015] optimizer is used with a learning rate of 5e-4 and a mini-batch size of 64. The weights in Equation 5 are 0.05, 1.0, 1.0, and 1.0, and γ in Equation 3 is 0.5. Following the setting of Zheng et al. [2019], we train our models for 100 epochs and select the checkpoint with the best F1 score on the development set to evaluate on the test set.
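
For quick reference, the hyper-parameters quoted in the Experiment Setup row can be collected into a single configuration sketch. This is a minimal illustration assembled only from the reported values; the structure and field names (e.g., hidden_size, loss_weights) are assumptions and do not mirror the configuration format of the authors' DocEE repository.

```python
# Hypothetical hyper-parameter sketch assembled from the quoted setup above.
# Field names are illustrative, not taken from the authors' code.
PTPCG_TRAINING_CONFIG = {
    "encoder": {
        "shared_bilstm_layers": 2,   # shared BiLSTM for event detection + entity extraction
        "entity_bilstm_layers": 2,   # separate BiLSTM stack for entity encoding
        "hidden_size": 768,          # d_h, randomly initialized embeddings
        "label_embedding_size": 32,  # d_l
    },
    "optimizer": {
        "name": "Adam",              # Kingma and Ba, 2015
        "learning_rate": 5e-4,
    },
    "batch_size": 64,
    "loss_weights": [0.05, 1.0, 1.0, 1.0],  # weights in Equation 5
    "gamma": 0.5,                            # γ in Equation 3
    "epochs": 100,
    # Model selection: checkpoint with the best F1 on the ChFinAnn development set,
    # then evaluated once on the test set.
}
```

Having the reported values in one place makes it easier to check a reproduction attempt against the paper, even though the exact training scripts live in the linked repository.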