Entity Alignment with Reliable Path Reasoning and Relation-aware Heterogeneous Graph Transformer
Authors: Weishan Cai, Wenjun Ma, Jieyu Zhan, Yuncheng Jiang
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental (see the Hits@k sketch after the table) | In this section, we evaluate the performance of RPR-RHGT on three widely used benchmark datasets. The code is now available at https://github.com/cwswork/RPR-RHGT. [...] Extensive experiments on three well-known datasets show RPR-RHGT significantly outperforms 10 state-of-the-art methods, exceeding the best-performing baseline by up to 8.62% on Hits@1. |
| Researcher Affiliation | Academia | 1) School of Computer Science, South China Normal University, China; 2) School of Computer and Information Engineering, Hanshan Normal University, China; 3) School of Artificial Intelligence, South China Normal University, China. Emails: caiws@m.scnu.edu.cn, phoenixsam@sina.com, {zhanjieyu, ycjiang}@scnu.edu.cn |
| Pseudocode | Yes | Algorithm 1: Procedure of RPR Algorithm. |
| Open Source Code | Yes | The code is now available at https://github.com/cwswork/RPR-RHGT. |
| Open Datasets | Yes | The three experimental datasets comprise cross-lingual datasets and a mono-lingual dataset: WK31-15K [Sun et al., 2020b] is from multi-lingual DBpedia and is used to evaluate model performance on sparse and dense datasets; each subset contains two versions: V1 is a sparse set obtained using the IDS algorithm, and V2 is twice as dense as V1. DBP-15K [Sun et al., 2017] is the most widely used dataset in the literature, and is also from DBpedia. DWY-100K [Sun et al., 2018] contains two mono-lingual KGs, which serve as large-scale datasets to better evaluate the scalability of experimental models. |
| Dataset Splits | Yes (a split sketch follows the table) | For WK31-15K and DBP-15K, the proportion of train, validation and test is 2:1:7, the same as [Sun et al., 2020b]. For DWY-100K, we adopt the same train (30%) / test (70%) split as the baselines. |
| Hardware Specification | Yes | The results, obtained on a workstation with a CPU (EPYC 3975WX, 256 GB RAM) and a GPU (RTX A4000, 16 GB), are shown in Table 4, which reveals large differences between methods. |
| Software Dependencies | No (an embedding sketch follows the table) | We use fastText (https://fasttext.cc/docs/en/crawl-vectors.html) to generate entity name embeddings that are uniformly applied to the baseline reproductions, including RDGCN, NMN, RAGA, MultiKE and COTSAE. |
| Experiment Setup | Yes (a config sketch follows the table) | For all datasets, we use the same hyper-parameters: τ_sim = 0.5, τ_path = 20, h_n = 4, γ1 = γ2 = 10, θ = 0.3. The embedding dimensions for the 15K and 100K datasets are 300 and 200, respectively. |
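The Hits@1 margin quoted in the Research Type row comes from the standard entity-alignment ranking protocol. Below is a minimal sketch of how Hits@k and MRR are typically computed, assuming cosine similarity and test embeddings whose rows are pre-aligned gold pairs; it is illustrative, not the authors' evaluation code.

```python
import numpy as np

def hits_at_k(src_emb, tgt_emb, ks=(1, 10)):
    """Hits@k and MRR for aligned embedding matrices.

    Row i of src_emb and row i of tgt_emb are assumed to be
    embeddings of the same underlying entity (a gold test pair).
    """
    # Cosine similarity between every source and target entity.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sim = src @ tgt.T                               # (n, n) similarity matrix
    # Rank of the gold target for each source entity (0 = ranked first).
    gold = np.arange(sim.shape[0])
    ranks = (sim > sim[gold, gold][:, None]).sum(axis=1)
    hits = {k: float((ranks < k).mean()) for k in ks}
    mrr = float((1.0 / (ranks + 1)).mean())
    return hits, mrr

# Toy usage: 1000 entities with 300-d embeddings (300 is the paper's
# reported dimension for the 15K datasets).
rng = np.random.default_rng(0)
src = rng.normal(size=(1000, 300))
tgt = src + 0.1 * rng.normal(size=(1000, 300))     # noisy "aligned" copies
print(hits_at_k(src, tgt))
```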
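The 2:1:7 proportion in the Dataset Splits row can be reproduced with a shuffle-and-slice over the gold links. `split_links` below is a hypothetical helper written for this report, not part of the released repository.

```python
import random

def split_links(links, ratios=(0.2, 0.1, 0.7), seed=42):
    """Split gold entity-alignment links into train/valid/test.

    ratios=(0.2, 0.1, 0.7) reproduces the 2:1:7 proportion used for
    WK31-15K and DBP-15K; (0.3, 0.0, 0.7) matches the DWY-100K setting.
    """
    assert abs(sum(ratios) - 1.0) < 1e-9
    links = list(links)
    random.Random(seed).shuffle(links)            # deterministic shuffle
    n = len(links)
    n_train = int(ratios[0] * n)
    n_valid = int(ratios[1] * n)
    train = links[:n_train]
    valid = links[n_train:n_train + n_valid]
    test = links[n_train + n_valid:]
    return train, valid, test

# Toy usage on 15,000 synthetic pairs (the 15K dataset scale).
pairs = [(f"kg1:e{i}", f"kg2:e{i}") for i in range(15000)]
train, valid, test = split_links(pairs)
print(len(train), len(valid), len(test))          # 3000 1500 10500
```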
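The Software Dependencies row names fastText as the only stated dependency. A minimal sketch of building entity name embeddings from the cited crawl vectors is shown below, assuming the official `fasttext` Python package and mean-pooling over name tokens; the paper does not specify its exact pooling, so treat that choice as illustrative.

```python
import numpy as np
import fasttext  # pip install fasttext

# Pretrained crawl vectors from the URL cited in the paper's footnote:
# https://fasttext.cc/docs/en/crawl-vectors.html (e.g. cc.en.300.bin).
model = fasttext.load_model("cc.en.300.bin")

def name_embedding(entity_name: str) -> np.ndarray:
    """Average the fastText vectors of an entity name's tokens.

    Mean-pooling is one common way to turn a multi-word entity label
    into a single vector; fastText's subword model returns a vector
    even for out-of-vocabulary tokens.
    """
    tokens = entity_name.replace("_", " ").lower().split()
    vecs = [model.get_word_vector(t) for t in tokens] or \
           [np.zeros(model.get_dimension(), dtype=np.float32)]
    return np.mean(vecs, axis=0)

print(name_embedding("South_China_Normal_University").shape)  # (300,)
```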
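For convenience, the values in the Experiment Setup row can be collected into a single configuration object. The key names below are our own shorthand; the released code at https://github.com/cwswork/RPR-RHGT may use different argument names.

```python
# Hyper-parameters reported for all datasets; symbol meanings follow the
# paper's notation and are not re-derived here.
CONFIG = {
    "tau_sim": 0.5,                     # τ_sim
    "tau_path": 20,                     # τ_path
    "h_n": 4,                           # h_n
    "gamma_1": 10,                      # γ1
    "gamma_2": 10,                      # γ2
    "theta": 0.3,                       # θ
    "dim": {"15K": 300, "100K": 200},   # embedding dimension per scale
}
```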