Efficient Document-level Event Extraction via Pseudo-Trigger-aware Pruned Complete Graph
Authors: Tong Zhu, Xiaoye Qu, Wenliang Chen, Zhefeng Wang, Baoxing Huai, Nicholas Yuan, Min Zhang
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that our PTPCG can reach competitive results with only 19.8% parameters of DAG-based SOTA models, just taking 3.8% GPU hours to train and up to 8.5 times faster in terms of inference. |
| Researcher Affiliation | Collaboration | 1Institute of Artificial Intelligence, School of Computer Science and Technology, Soochow University, China 2Huawei Cloud, China |
| Pseudocode | No | The paper describes algorithms (e.g., Bron-Kerbosch algorithm, non-autoregressive decoding algorithm) and illustrates concepts with figures, but it does not provide formal pseudocode blocks or algorithms labeled as such. |
| Open Source Code | Yes | Codes are available at https://github.com/Spico197/DocEE. |
| Open Datasets | Yes | We use ChFinAnn [Zheng et al., 2019] and DuEE-fin [Li, 2021] datasets to make fair comparisons across all methods. |
| Dataset Splits | Yes | We choose the hyper-parameters of our system according to the performance on the development set in ChFinAnn. |
| Hardware Specification | Yes | Figure 4: Inference speed comparison with baselines (left) and with different |R| (right) with 1 NVIDIA V100 GPU for all models. |
| Software Dependencies | No | The paper mentions software components like 'BiLSTM' and 'Adam optimizer' but does not specify version numbers for general software dependencies (e.g., Python, PyTorch/TensorFlow, specific library versions). |
| Experiment Setup | Yes | In PTPCG, we use 2 layers of shared BiLSTM for event detection and entity extraction, and another 2 layers of BiLSTM for entity encoding. We use the same vocabulary as [Zheng et al., 2019] and randomly initialize all the embeddings where d_h=768 and d_l=32. Adam [Kingma and Ba, 2015] optimizer is used with a learning rate of 5e-4 and the mini-batch size is 64. The weights in Equation 5 are 0.05, 1.0, 1.0, and 1.0, and γ in Equation 3 is 0.5. Following the setting in Zheng et al. [2019], we train our models for 100 epochs and select the checkpoint with the best F1 score on the development set to evaluate on the test set. |
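
The Pseudocode row notes that the paper relies on the Bron-Kerbosch algorithm for maximal-clique search on the pruned complete graph without giving formal pseudocode. For reference, a minimal textbook-style sketch (without pivoting) is shown below; the `graph` adjacency dictionary and the toy node labels are illustrative assumptions, not the data structures used in the DocEE repository.

```python
# Minimal Bron-Kerbosch maximal-clique enumeration (no pivoting), for illustration only.
# `graph` maps each node to the set of its neighbors; it is a hypothetical stand-in,
# not the authors' argument-candidate graph.

def bron_kerbosch(r, p, x, graph, cliques):
    """Recursively collect maximal cliques.

    r: nodes in the current clique
    p: candidate nodes that could still extend r
    x: nodes already processed (prevents reporting non-maximal/duplicate cliques)
    """
    if not p and not x:
        cliques.append(set(r))
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & graph[v], x & graph[v], graph, cliques)
        p = p - {v}
        x = x | {v}


# Toy usage: nodes 0-3, where {0, 1, 2} form a triangle and node 3 attaches to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cliques = []
bron_kerbosch(set(), set(graph), set(), graph, cliques)
print(cliques)  # maximal cliques: {0, 1, 2} and {2, 3}
```

In PTPCG this search runs over the pruned graph of argument candidates, so each maximal clique yields a candidate argument combination; the toy graph above only demonstrates the enumeration itself.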
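
The Experiment Setup row quotes concrete hyper-parameters (2-layer BiLSTMs, d_h=768, d_l=32, Adam with a 5e-4 learning rate, mini-batch size 64, 100 epochs). A minimal PyTorch sketch of such a configuration is given below purely for orientation; the module name `SharedEncoder`, the placeholder vocabulary size, and the per-direction hidden-size split are assumptions, and the authoritative implementation is the released code at https://github.com/Spico197/DocEE.

```python
# Illustrative sketch of the quoted hyper-parameter setup (NOT the released DocEE code):
# a 2-layer shared BiLSTM encoder and an Adam optimizer with lr=5e-4, as reported above.
import torch
import torch.nn as nn

VOCAB_SIZE = 30000   # assumption: placeholder vocabulary size
D_H = 768            # hidden/embedding size d_h from the quoted setup
D_L = 32             # label embedding size d_l from the quoted setup

class SharedEncoder(nn.Module):
    """2-layer shared BiLSTM over token embeddings (hypothetical module)."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, D_H)
        self.bilstm = nn.LSTM(
            input_size=D_H,
            hidden_size=D_H // 2,   # forward+backward states concatenate back to d_h
            num_layers=2,
            batch_first=True,
            bidirectional=True,
        )

    def forward(self, token_ids):
        emb = self.embed(token_ids)       # (batch, seq_len, d_h)
        hidden, _ = self.bilstm(emb)      # (batch, seq_len, d_h)
        return hidden


model = SharedEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # learning rate quoted above
BATCH_SIZE = 64
NUM_EPOCHS = 100
LOSS_WEIGHTS = (0.05, 1.0, 1.0, 1.0)  # weights quoted for Equation 5 in the paper
```

Whether the authors split d_h between the two LSTM directions this way is not stated in the quoted text; the sketch only reflects the reported sizes, optimizer, and schedule.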