Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Effectiveness of Automatic Translations for Cross-Lingual Ontology Mapping
Authors: Mamoun Abu Helou, Matteo Palmonari, Mustafa Jarrar
JAIR 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper we present a large-scale study on the effectiveness of automatic translations to support two key cross-lingual ontology mapping tasks: the retrieval of candidate matches and the selection of the correct matches for inclusion in the final alignment. We conduct our experiments using four different large gold standards, each one consisting of a pair of mapped wordnets, to cover four different families of languages. |
| Researcher Affiliation | Academia | Mamoun Abu Helou EMAIL Department of Informatics, Systems and Communication, University of Milano-Bicocca; Matteo Palmonari EMAIL Department of Informatics, Systems and Communication, University of Milano-Bicocca; Mustafa Jarrar EMAIL Department of Computer Science, Birzeit University |
| Pseudocode | No | The paper describes mathematical definitions for evaluation measures and conceptual steps for translation tasks and mapping selection, but it does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement that the authors' own implementation code for the described methodology is being released, nor does it provide a direct link to a code repository. It mentions using existing tools like Google Translate and BabelNet, but does not provide code for their specific study. |
| Open Datasets | Yes | As gold standards, we use cross-lingual mappings manually established (or validated) by lexicographers between four wordnets (Arabic, Italian, Slovene and Spanish) and the English WordNet. Footnote 11: The Arabic, Italian, and Slovene wordnets are obtained from OMWN (2015), and the Spanish wordnet is obtained from MCR (2012). |
| Dataset Splits | No | The paper uses pre-existing wordnets as gold standards for evaluation. It classifies concepts within these wordnets (e.g., monosemous, polysemous, synonymless, synonymful) and evaluates translation effectiveness against these categories. However, it does not describe training, validation, or test splits of these datasets, as would be expected for a machine-learning experiment. |
| Hardware Specification | No | The paper describes the experimental setup and methodology, but it does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to conduct the experiments. |
| Software Dependencies | Yes | In our study, we use two multilingual lexical resources as sources of translations: Google Translate (2015) and BabelNet (Navigli & Ponzetto, 2012). Footnote 1: We used BabelNet version 2.5. |
| Experiment Setup | Yes | In Section 2, we introduce some preliminary definitions used in the rest of the paper. In Section 3, we overview related work... The evaluation measures and the multilingual lexical resources used in our study to obtain translations are presented in Sections 4 and 5, respectively. In Section 6, we present the experiments. Section 4 defines specific measures, such as Translation Correctness (Eq. 5), Word Sense Coverage (Eq. 6), Synset Coverage (Eq. 7), and Synonym Coverage (Eq. 8), which are central to the experimental setup. Section 6.1 describes the experimental setup, including importing the wordnets into a database and compiling bilingual dictionaries with the Google Translate API and BabelNet. |