Unsupervised Knowledge Graph Alignment by Probabilistic Reasoning and Semantic Embedding

Authors: Zhiyuan Qi, Ziheng Zhang, Jiaoyan Chen, Xi Chen, Yuejia Xiang, Ningyu Zhang, Yefeng Zheng

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The PRASE framework has been evaluated on five widely used datasets and one industry dataset. The results show the state-of-the-art performance of PRASE.
Researcher Affiliation | Collaboration | Zhiyuan Qi (1), Ziheng Zhang (1), Jiaoyan Chen (2), Xi Chen (1,3), Yuejia Xiang (1,3), Ningyu Zhang (4), and Yefeng Zheng (1). (1) Tencent Jarvis Lab, Shenzhen, China; (2) Department of Computer Science, University of Oxford, UK; (3) Platform and Content Group, Tencent, Shenzhen, China; (4) Zhejiang University, Hangzhou, China
Pseudocode | Yes | Algorithm 1 PARIS-based PRASE Implementation
Open Source Code | Yes | This section presents the evaluation of PRASE, and the code is available at https://github.com/qizhyuan/PRASE-Python.
Open Datasets | Yes | OpenEA Datasets: The OpenEA datasets (https://github.com/nju-websoft/OpenEA) are constructed based on DBpedia, YAGO, and Wikidata [Sun et al., 2020]. We use all their large-scale datasets of the version V2 that has more complex KG structures. They include two cross-lingual datasets (i.e., EN-FR-100K-V2 and EN-DE-100K-V2) and two cross-KG datasets (i.e., D-W-100K-V2 and D-Y-100K-V2). We also use a small dataset D-W-15K-V2, a relatively difficult dataset as reported by [Sun et al., 2020]. In the following, the annotation -V2 is omitted. Industry Dataset: MED-BBK-9K is an industry dataset proposed by [Zhang et al., 2020], which is built from an authoritative medical KG and a KG extracted from Baidu Baike, a Chinese online encyclopedia.
Dataset Splits | Yes | We adopt the implementations of the embedding-based models from OpenEA [Sun et al., 2020] with the same dataset division: 20%, 10%, and 70% of the entity mappings for training, validation, and testing, respectively.
Hardware Specification | Yes | Our experiments are conducted on a workstation with an Intel Xeon E5 CPU and an NVIDIA Tesla M40 GPU.
Software Dependencies | No | The paper mentions implementing PARIS in Python and that the original PARIS was in Java, but does not provide specific version numbers for Python, Java, or any other software dependencies or libraries.
Experiment Setup | Yes | We set α1 = α2 = 1, β = 0.8, δ1 = δ2 = δf = 0.1, and choose cosine similarity as sim(·). Since a small value of K is found to be sufficient for PRASE to demonstrate its effectiveness, we set K = 1 in the experiments unless specified.
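
The Dataset Splits and Experiment Setup entries above can be illustrated with a minimal Python sketch. It is not taken from the PRASE-Python repository: the names REPORTED_SETUP, split_mappings, and cosine_similarity are hypothetical, the dictionary keys are only labels for the quoted values (their roles are defined in the paper), and the seeded shuffle is an assumption made for illustration. In practice the division provided by OpenEA would presumably be reused rather than regenerated.

```python
import math
import random

# Values quoted in the Experiment Setup entry; the key names are hypothetical
# labels chosen for this sketch, not identifiers from the PRASE code base.
REPORTED_SETUP = {
    "alpha1": 1.0,
    "alpha2": 1.0,
    "beta": 0.8,
    "delta1": 0.1,
    "delta2": 0.1,
    "delta_f": 0.1,
    "K": 1,
}


def split_mappings(mappings, seed=0):
    """Split entity mappings 20% / 10% / 70% into train / validation / test.

    The seeded shuffle is an assumption of this sketch; the source only
    states the proportions.
    """
    shuffled = list(mappings)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding on the cut points.
    n_train = len(shuffled) * 20 // 100
    n_valid = len(shuffled) * 10 // 100
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remainder, about 70%
    return train, valid, test


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

For example, calling split_mappings on 1,000 hypothetical entity pairs yields lists of 200, 100, and 700 pairs, and cosine_similarity plays the role of sim(·) when comparing two entity embedding vectors.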