Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Entity Alignment with Reliable Path Reasoning and Relation-aware Heterogeneous Graph Transformer
Authors: Weishan Cai, Wenjun Ma, Jieyu Zhan, Yuncheng Jiang
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate the performance of RPR-RHGT on three widely used benchmark datasets. The code is now available at https://github.com/cwswork/RPR-RHGT. [...] Extensive experiments on three well-known datasets show RPR-RHGT significantly outperforms 10 state-of-the-art methods, exceeding the best performing baseline up to 8.62% on Hits@1. |
| Researcher Affiliation | Academia | 1 School of Computer Science, South China Normal University, China; 2 School of Computer and Information Engineering, Hanshan Normal University, China; 3 School of Artificial Intelligence, South China Normal University, China |
| Pseudocode | Yes | Algorithm 1 Procedure of RPR Algorithm. |
| Open Source Code | Yes | The code is now available at https://github.com/cwswork/RPR-RHGT. |
| Open Datasets | Yes | Three experimental datasets contain cross-lingual datasets and a mono-lingual dataset: WK3l-15K [Sun et al., 2020b] is from multi-lingual DBpedia and is used to evaluate model performance on sparse and dense datasets, where each subset contains two versions: V1 is a sparse set obtained by using the IDS algorithm, and V2 is twice as dense as V1. DBP-15K [Sun et al., 2017] is the most used dataset in the literature, and is also from DBpedia. DWY-100K [Sun et al., 2018] contains two mono-lingual KGs, which serve as large-scale datasets to better evaluate the scalability of experimental models. |
| Dataset Splits | Yes | For WK3l-15K and DBP-15K, the proportion of train, validation and test is 2:1:7, the same as [Sun et al., 2020b]. For DWY-100K, we adopt the same train (30%) / test (70%) split as baselines. |
| Hardware Specification | Yes | The results running on a workstation with CPU (EPYC 3975WX +256G RAM) and GPU (RTX A4000 with 16G) are shown in Table 4, which shows large differences between different methods. |
| Software Dependencies | No | We use fastText to generate entity name embeddings that are uniformly applied to baseline recurrence, including RDGCN, NMN, RAGA, MultiKE and COTSAE. (https://fasttext.cc/docs/en/crawl-vectors.html) |
| Experiment Setup | Yes | For all datasets, we use the same weight hyper-parameters: τ_sim = 0.5, τ_path = 20, h_n = 4, γ_1 = γ_2 = 10, θ = 0.3. The embedding dimensions of the 15K and 100K datasets are 300 and 200, respectively. |
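The 2:1:7 train/validation/test split reported above can be sketched as follows. This is a minimal illustration, not code from the RPR-RHGT repository: the function name `split_2_1_7` and the (source, target) pair structure for alignment seeds are assumptions for the example.

```python
import random

def split_2_1_7(pairs, seed=2022):
    """Split aligned entity pairs into train/valid/test at a 2:1:7 ratio.

    `pairs` is a list of (source_entity, target_entity) alignment seeds;
    a fixed seed keeps the split reproducible across runs.
    """
    rng = random.Random(seed)
    shuffled = pairs[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 2 // 10        # 20% train
    n_valid = n * 1 // 10        # 10% validation
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remaining 70% test
    return train, valid, test

# Example with 15,000 aligned pairs, as in the 15K benchmarks.
pairs = [(f"e1_{i}", f"e2_{i}") for i in range(15000)]
train, valid, test = split_2_1_7(pairs)
print(len(train), len(valid), len(test))  # 3000 1500 10500
```

For DWY-100K the same idea applies with a 30%/70% train/test cut and no validation portion, per the split reported in the table.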