Universal Information Extraction as Unified Semantic Matching
Authors: Jie Lou, Yaojie Lu, Dai Dai, Wei Jia, Hongyu Lin, Xianpei Han, Le Sun, Hua Wu
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluation on 4 IE tasks shows that the proposed method achieves state-of-the-art performance in supervised experiments and demonstrates strong generalization ability in zero/few-shot transfer settings. |
| Researcher Affiliation | Collaboration | 1 Baidu Inc., Beijing, China; 2 Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; 3 State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China. {loujie, daidai, jiawei07, wu_hua}@baidu.com, {luyaojie, hongyu, xianpei, sunle}@iscas.ac.cn |
| Pseudocode | No | The paper includes diagrams (Figure 1, Figure 2, Figure 3) illustrating the framework and operations, but it does not present any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an unambiguous statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We use Ontonotes (Pradhan et al. 2013), widely used in the field of information extraction as gold annotation, which contains 18 entity types. We employ NYT (Riedel et al. 2013) and Rebel (Huguet Cabot and Navigli 2021) as our distant supervision datasets... We utilize reading comprehension datasets from MRQA (Fisch et al. 2019) as our indirect supervision datasets: HotpotQA (Yang et al. 2018), Natural Questions (Kwiatkowski et al. 2019), NewsQA (Trischler et al. 2017), SQuAD (Rajpurkar et al. 2016) and TriviaQA (Joshi et al. 2017). |
| Dataset Splits | Yes | For few-shot transfer experiments, we follow the data splits and settings of previous work (Lu et al. 2022) and repeat each experiment 10 times to avoid the influence of random sampling (Huang et al. 2021). |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU models, CPU types, memory) for running its experiments. |
| Software Dependencies | No | The paper mentions using "RoBERTa-Large (Liu et al. 2019)" as the pre-trained transformer encoder, but it does not specify any software dependencies with version numbers (e.g., Python version, specific deep learning frameworks like PyTorch or TensorFlow versions). |
| Experiment Setup | No | The paper states, "We employ the same end-to-end settings and evaluation metrics as Lu et al. (2022)." and "We run each experiment with three seeds and report their average performance." However, it defers to that earlier work for the specific settings and does not explicitly list hyperparameters or detailed training configurations in its own text. |