Unsupervised Knowledge Graph Alignment by Probabilistic Reasoning and Semantic Embedding
Authors: Zhiyuan Qi, Ziheng Zhang, Jiaoyan Chen, Xi Chen, Yuejia Xiang, Ningyu Zhang, Yefeng Zheng
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The PRASE framework has been evaluated on five widely used datasets and one industry dataset. The results show the state-of-the-art performance of PRASE. |
| Researcher Affiliation | Collaboration | Zhiyuan Qi¹, Ziheng Zhang¹, Jiaoyan Chen², Xi Chen¹,³, Yuejia Xiang¹,³, Ningyu Zhang⁴, and Yefeng Zheng¹. ¹Tencent Jarvis Lab, Shenzhen, China; ²Department of Computer Science, University of Oxford, UK; ³Platform and Content Group, Tencent, Shenzhen, China; ⁴Zhejiang University, Hangzhou, China |
| Pseudocode | Yes | Algorithm 1 PARIS-based PRASE Implementation |
| Open Source Code | Yes | This section presents the evaluation of PRASE, and the code is available at https://github.com/qizhyuan/PRASE-Python. |
| Open Datasets | Yes | OpenEA Datasets: The OpenEA datasets (https://github.com/nju-websoft/OpenEA) are constructed based on DBpedia, YAGO, and Wikidata [Sun et al., 2020]. We use all their large-scale datasets of the version V2 that has more complex KG structures. They include two cross-lingual datasets (i.e., EN-FR-100K-V2 and EN-DE-100K-V2) and two cross-KG datasets (i.e., D-W-100K-V2 and D-Y-100K-V2). We also use a small dataset D-W-15K-V2, a relatively difficult dataset as reported by [Sun et al., 2020]. In the following, the annotation -V2 is omitted. Industry Dataset: MED-BBK-9K is an industry dataset proposed by [Zhang et al., 2020], which is built from an authoritative medical KG and a KG extracted from Baidu Baike, a Chinese online encyclopedia. |
| Dataset Splits | Yes | We adopt the implementations of the embedding-based models from OpenEA [Sun et al., 2020] with the same dataset division: 20%, 10%, and 70% of the entity mappings for training, validation, and testing, respectively. |
| Hardware Specification | Yes | Our experiments are conducted on a workstation with an Intel Xeon E5 CPU and an NVIDIA Tesla M40 GPU. |
| Software Dependencies | No | The paper mentions implementing PARIS in Python and that the original PARIS was in Java, but does not provide specific version numbers for Python, Java, or any other software dependencies or libraries. |
| Experiment Setup | Yes | We set α1 = α2 = 1, β = 0.8, δ1 = δ2 = δf = 0.1, and choose cosine similarity as sim(·). Since a small value of K is found to be sufficient for PRASE to demonstrate its effectiveness, we set K = 1 in the experiments unless specified. |
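
The split ratio and similarity function quoted in the table can be illustrated with a short Python sketch. This is not taken from the PRASE or OpenEA code: the names `PRASE_SETTINGS`, `split_mappings`, and `cosine_similarity` are hypothetical, and the random shuffle is an assumption made only to show the 20/10/70 proportions. The paper reuses the dataset division from OpenEA rather than drawing a new one.

```python
import math
import random

# Hyperparameter values quoted in the Experiment Setup row.
# The key names are hypothetical and do not come from the PRASE code base.
PRASE_SETTINGS = {
    "alpha_1": 1.0, "alpha_2": 1.0,
    "beta": 0.8,
    "delta_1": 0.1, "delta_2": 0.1, "delta_f": 0.1,
    "K": 1,
}


def split_mappings(mappings, train_pct=20, valid_pct=10, seed=0):
    """Split entity mappings into train/validation/test parts.

    Illustrative only: a seeded shuffle stands in for whatever fixed
    division the benchmark actually provides.
    """
    items = list(mappings)
    random.Random(seed).shuffle(items)
    n_train = len(items) * train_pct // 100
    n_valid = len(items) * valid_pct // 100
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]  # the remainder, 70% by default
    return train, valid, test


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Using integer percentages avoids floating-point rounding when computing the part sizes, and the test part takes whatever remains so that no mapping is dropped.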