ARIA: Asymmetry Resistant Instance Alignment
Authors: Sanghoon Lee, Seung-won Hwang
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental Evaluation Settings: Evaluations were conducted on an Intel quad-core i7 3.6GHz CPU with 32 GB RAM equipped with Java 7. Alignment accuracy was measured by precision and recall. To evaluate blocking quality, we used reduction ratio (RR) and pair completeness (PC). RR is the ratio of pruned instance pairs among all possible pairs, and PC is the ratio of true matches for all pairs. We encoded the identifiers (e.g., URIs) of instances, relations, and concepts to avoid cheating by using URI text as alignment clues. For datasets, we used DBpedia (Lehmann et al. 2014) and YAGO (Biega, Kuzey, and Suchanek 2013), which are real-world large-scale KBs that cover millions of instances. |
| Researcher Affiliation | Academia | Sanghoon Lee and Seung-won Hwang Pohang University of Science and Technology (POSTECH), Korea, Republic of {sanghoon, swhwang}@postech.edu |
| Pseudocode | Yes | Algorithm 1: Block(IX, IY, TX, TY, k, t) |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | For datasets, we used DBpedia (Lehmann et al. 2014) and YAGO (Biega, Kuzey, and Suchanek 2013), which are real-world large-scale KBs that cover millions of instances. |
| Dataset Splits | No | The paper mentions using 'seed matches' as training data for learning concept correlations and refers to 'gold standards' and 'ground truth' for evaluation. However, it does not explicitly provide details about training/validation/test dataset splits (e.g., percentages, absolute counts, or predefined splits) for model training or hyperparameter tuning. |
| Hardware Specification | Yes | Evaluations were conducted on an Intel quad-core i7 3.6GHz CPU with 32 GB RAM equipped with Java 7. |
| Software Dependencies | Yes | equipped with Java 7 |
| Experiment Setup | Yes | Candidate degree threshold t was set to 10 in this experiment. Our blocking method showed near perfect reduction ratio (RR) in all domains (Table 4), which shows that the method has high effectiveness in reducing the search space for matching. Pair completeness (PC) is the upper bound to recall of alignment. PC was sufficiently high for the person and location domains, and ARIA achieved recall close to the bound obtained from PC. Note this bound is notably low for organizations due to feature sparsity, which explains low recalls of both ARIA and PARIS for this specific domain. Lastly, we evaluate the robustness of instance similarities between the result candidates of blocking methods for each domain (Table 5). We set triple similarity threshold θ as 0.8. |
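The blocking metrics quoted above (RR as the fraction of pruned pairs among all possible cross-KB pairs, PC as the fraction of true matches retained, and PC as an upper bound on alignment recall) can be sketched as follows. This is an illustrative reimplementation of the evaluation measures, not code from the paper; all names and the toy data are hypothetical.

```python
# Sketch of the blocking-quality metrics described in the report:
# reduction ratio (RR) and pair completeness (PC). Function and
# variable names are illustrative, not from the ARIA paper.

def reduction_ratio(candidate_pairs, n_x, n_y):
    """RR: fraction of all n_x * n_y possible instance pairs that
    blocking pruned (higher = smaller search space for matching)."""
    total_pairs = n_x * n_y
    return 1 - len(candidate_pairs) / total_pairs

def pair_completeness(candidate_pairs, true_matches):
    """PC: fraction of ground-truth matches that survive blocking.
    PC upper-bounds the recall any matcher can achieve afterwards."""
    return len(true_matches & candidate_pairs) / len(true_matches)

# Toy example: 4 instances per KB, so 16 possible pairs; blocking keeps 3.
candidates = {("x1", "y1"), ("x2", "y2"), ("x3", "y4")}
truth = {("x1", "y1"), ("x2", "y2"), ("x4", "y3")}

rr = reduction_ratio(candidates, 4, 4)      # 1 - 3/16 = 0.8125
pc = pair_completeness(candidates, truth)   # 2 of 3 true matches kept
```

A near-perfect RR with a high PC is the regime the paper reports for the person and location domains; the low PC for organizations directly caps the recall of any downstream matcher, which is why both ARIA and PARIS show low recall there.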