Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
BERTMap: A BERT-Based Ontology Alignment System
Authors: Yuan He, Jiaoyan Chen, Denvar Antonyrajah, Ian Horrocks
Pages: 5684-5691
AAAI 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation with three alignment tasks on biomedical ontologies demonstrates that BERTMap can often perform better than the leading OM systems LogMap and AML. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, University of Oxford, UK; (2) Samsung Research, UK |
| Pseudocode | Yes | Algorithm 1: Iterative Mapping Extension |
| Open Source Code | Yes | Codes and data: https://github.com/KRR-Oxford/BERTMap. |
| Open Datasets | Yes | The evaluation considers the FMA-SNOMED and FMA-NCI small fragment tasks of the OAEI Large Bio Track. They have large-scale ontologies and high quality gold standards created by domain experts. |
| Dataset Splits | Yes | In the unsupervised setting, we divide $M_{=}$ into $M_{val}$ (10%) and $M_{test}$ (90%); and in the semi-supervised setting, we divide $M_{=}$ into $M_{train}$ (20%), $M_{val}$ (10%) and $M_{test}$ (70%). |
| Hardware Specification | Yes | The training uses a single GTX 1080Ti GPU. |
| Software Dependencies | No | The paper mentions that the implementation uses 'owlready2' and 'transformers' libraries but does not provide specific version numbers for them. |
| Experiment Setup | Yes | The BERT model is fine-tuned for 3 epochs with a batch size of 32, and evaluated on the validation set for every 0.1 epoch, through which the best checkpoint on the cross-entropy loss is selected for prediction. The cut-off of sub-word inverted index-based candidate selection is set to 200. Besides, we set the positive-negative sample ratio to 1 : 4. |
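
The split proportions quoted in the Dataset Splits row can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `split_mappings`, the `Mapping` type, the shuffling strategy, and the fixed seed are all assumptions made for illustration, since the quoted text gives only the proportions and does not say how the reference mappings were partitioned.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation of one reference mapping:
# (source class identifier, target class identifier).
Mapping = Tuple[str, str]


def split_mappings(
    mappings: Sequence[Mapping],
    fractions: Dict[str, float],
    seed: int = 42,  # assumed seed; the quoted text does not specify one
) -> Dict[str, List[Mapping]]:
    """Shuffle the mappings and cut them into named, disjoint splits."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("split fractions must sum to 1")

    shuffled = list(mappings)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    names = list(fractions)
    splits: Dict[str, List[Mapping]] = {}
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = n  # the last split absorbs any rounding remainder
        else:
            end = min(n, start + round(fractions[name] * n))
        splits[name] = shuffled[start:end]
        start = end
    return splits


# Proportions as quoted in the table above.
UNSUPERVISED = {"val": 0.1, "test": 0.9}
SEMI_SUPERVISED = {"train": 0.2, "val": 0.1, "test": 0.7}
```

Calling `split_mappings(reference_mappings, SEMI_SUPERVISED)` would return a dictionary with `train`, `val`, and `test` lists in a 20/10/70 ratio. Fixing the seed keeps the partition stable across runs, which matters when comparing results between the two settings.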