Universal Information Extraction as Unified Semantic Matching
Authors: Jie Lou, Yaojie Lu, Dai Dai, Wei Jia, Hongyu Lin, Xianpei Han, Le Sun, Hua Wu
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluation on 4 IE tasks shows that the proposed method achieves state-of-the-art performance in supervised experiments and demonstrates strong generalization ability in zero/few-shot transfer settings. |
| Researcher Affiliation | Collaboration | 1 Baidu Inc., Beijing, China; 2 Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing, China; 3 State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China. {loujie, daidai, jiawei07, wu_hua}@baidu.com, {luyaojie, hongyu, xianpei, sunle}@iscas.ac.cn |
| Pseudocode | No | The paper includes diagrams (Figure 1, Figure 2, Figure 3) illustrating the framework and operations, but it does not present any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an unambiguous statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We use Ontonotes (Pradhan et al. 2013), widely used in the field of information extraction as gold annotation, which contains 18 entity types. We employ NYT (Riedel et al. 2013) and Rebel (Huguet Cabot and Navigli 2021) as our distant supervision datasets... We utilize reading comprehension datasets from MRQA (Fisch et al. 2019) as our indirect supervision datasets: HotpotQA (Yang et al. 2018), Natural Questions (Kwiatkowski et al. 2019), NewsQA (Trischler et al. 2017), SQuAD (Rajpurkar et al. 2016) and TriviaQA (Joshi et al. 2017). |
| Dataset Splits | Yes | For few-shot transfer experiments, we follow the data splits and settings of previous work (Lu et al. 2022) and repeat each experiment 10 times to avoid the influence of random sampling (Huang et al. 2021). |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU models, CPU types, memory) for running its experiments. |
| Software Dependencies | No | The paper mentions using "RoBERTa-Large (Liu et al. 2019)" as the pre-trained transformer encoder, but it does not specify any software dependencies with version numbers (e.g., Python version, specific deep learning frameworks like PyTorch or TensorFlow versions). |
| Experiment Setup | No | The paper states, "We employ the same end-to-end settings and evaluation metrics as Lu et al. (2022)." and "We run each experiment with three seeds and report their average performance." However, it defers to that earlier work for the specific settings and does not explicitly list hyperparameters or detailed training configurations in its own text. |