ARIA: Asymmetry Resistant Instance Alignment

Authors: Sanghoon Lee, Seung-won Hwang

AAAI 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental Evaluation Settings: Evaluations were conducted on an Intel quad-core i7 3.6GHz CPU with 32 GB RAM equipped with Java 7. Alignment accuracy was measured by precision and recall. To evaluate blocking quality, we used reduction ratio (RR) and pair completeness (PC). RR is the ratio of pruned instance pairs among all possible pairs, and PC is the ratio of true matches for all pairs. We encoded the identifiers (e.g., URIs) of instances, relations, and concepts to avoid cheating by using URI text as alignment clues. For datasets, we used DBpedia (Lehmann et al. 2014) and YAGO (Biega, Kuzey, and Suchanek 2013), which are real-world large-scale KBs that cover millions of instances.
Researcher Affiliation	Academia	Sanghoon Lee and Seung-won Hwang Pohang University of Science and Technology (POSTECH), Korea, Republic of {sanghoon, swhwang}@postech.edu
Pseudocode	Yes	Algorithm 1: Block(I_X, I_Y, T_X, T_Y, k, t)
Open Source Code	No	The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets	Yes	For datasets, we used DBpedia (Lehmann et al. 2014) and YAGO (Biega, Kuzey, and Suchanek 2013), which are real-world large-scale KBs that cover millions of instances.
Dataset Splits	No	The paper mentions using 'seed matches' as training data for learning concept correlations and refers to 'gold standards' and 'ground truth' for evaluation. However, it does not explicitly provide details about training/validation/test dataset splits (e.g., percentages, absolute counts, or predefined splits) for model training or hyperparameter tuning.
Hardware Specification	Yes	Evaluations were conducted on an Intel quad-core i7 3.6GHz CPU with 32 GB RAM equipped with Java 7.
Software Dependencies	Yes	equipped with Java 7
Experiment Setup	Yes	Candidate degree threshold t was set to 10 in this experiment. Our blocking method showed near perfect reduction ratio (RR) in all domains (Table 4), which shows that the method has high effectiveness in reducing the search space for matching. Pair completeness (PC) is the upper bound to recall of alignment. PC was sufficiently high for the person and location domains, and ARIA achieved recall close to the bound obtained from PC. Note this bound is notably low for organizations due to feature sparsity, which explains low recalls of both ARIA and PARIS for this specific domain. Lastly, we evaluate the robustness of instance similarities between the result candidates of blocking methods for each domain (Table 5). We set triple similarity threshold θ as 0.8.
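
The metrics quoted in the table (reduction ratio, pair completeness, precision, recall) can be illustrated with a small sketch. This is not the paper's code: the function names and the toy instance pairs are hypothetical, and pair completeness is computed with the common blocking definition (the share of gold-standard matches that survive blocking), which is an assumption about what the quoted wording intends. The sketch also shows why PC bounds recall when the aligner only chooses among blocked candidates.

```python
def reduction_ratio(num_candidates, size_x, size_y):
    """Share of the size_x * size_y possible pairs that blocking pruned."""
    total_pairs = size_x * size_y
    return 1.0 - num_candidates / total_pairs


def pair_completeness(candidates, gold):
    """Share of gold matches that are still present among the candidates
    (assumed standard definition, not taken from the paper)."""
    return len(candidates & gold) / len(gold)


def precision_recall(predicted, gold):
    """Precision and recall of predicted alignments against the gold standard."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: two KBs with 10 instances each.
size_x = size_y = 10
gold = {("x1", "y1"), ("x2", "y2"), ("x3", "y3"), ("x4", "y4")}

# Candidate pairs kept by blocking; the true match (x4, y4) was pruned.
candidates = {("x1", "y1"), ("x1", "y2"), ("x2", "y2"), ("x3", "y3"), ("x3", "y4")}

# The aligner picks matches only from the candidates.
predicted = {("x1", "y1"), ("x2", "y2"), ("x3", "y4")}

rr = reduction_ratio(len(candidates), size_x, size_y)
pc = pair_completeness(candidates, gold)
precision, recall = precision_recall(predicted, gold)

print(f"RR={rr:.2f} PC={pc:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Because every predicted pair is drawn from the candidate set, the correctly predicted pairs are a subset of the gold matches that survived blocking, so recall cannot exceed PC. This is consistent with the quoted observation that a low PC for organizations limits the recall of any aligner run on those candidates.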