BERTMap: A BERT-Based Ontology Alignment System

Authors: Yuan He, Jiaoyan Chen, Denvar Antonyrajah, Ian Horrocks

AAAI 2022, pp. 5684-5691

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation with three alignment tasks on biomedical ontologies demonstrates that BERTMap can often perform better than the leading OM systems LogMap and AML.
Researcher Affiliation | Collaboration | 1 Department of Computer Science, University of Oxford, UK; 2 Samsung Research, UK; {yuan.he,jiaoyan.chen,ian.horrocks}@cs.ox.ac.uk, denvar.a@samsung.com
Pseudocode | Yes | Algorithm 1: Iterative Mapping Extension (a hedged sketch of such a loop follows the table)
Open Source Code | Yes | Codes and data: https://github.com/KRR-Oxford/BERTMap
Open Datasets | Yes | The evaluation considers the FMA-SNOMED and FMA-NCI small fragment tasks of the OAEI LargeBio Track. They have large-scale ontologies and high-quality gold standards created by domain experts.
Dataset Splits | Yes | In the unsupervised setting, we divide M= into Mval (10%) and Mtest (90%); and in the semi-supervised setting, we divide M= into Mtrain (20%), Mval (10%) and Mtest (70%). (see the split sketch after the table)
Hardware Specification | Yes | The training uses a single GTX 1080Ti GPU.
Software Dependencies | No | The paper mentions that the implementation uses the owlready2 and transformers libraries but does not provide specific version numbers for them. (see the dependency sketch after the table)
Experiment Setup | Yes | The BERT model is fine-tuned for 3 epochs with a batch size of 32, and evaluated on the validation set every 0.1 epoch, through which the best checkpoint on the cross-entropy loss is selected for prediction. The cut-off of sub-word inverted index-based candidate selection is set to 200. In addition, the positive-negative sample ratio is set to 1:4. (see the fine-tuning sketch after the table)
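
The Pseudocode row refers to Algorithm 1 (Iterative Mapping Extension). The following is a minimal sketch of such a neighbourhood-expansion loop, not a transcription of Algorithm 1: the helpers parents, children and bert_score, and the threshold value, are assumptions introduced here for illustration.

```python
def iterative_mapping_extension(seed_mappings, parents, children, bert_score, threshold=0.9):
    """Expand high-confidence class mappings to neighbouring class pairs.

    `parents`/`children` map a class IRI to its one-hop neighbours and
    `bert_score` returns the fine-tuned classifier's synonym score for a
    pair of classes; all three are hypothetical helpers, and the threshold
    value is illustrative rather than taken from the paper.
    """
    extended = dict(seed_mappings)          # {(src_iri, tgt_iri): score}
    frontier = list(seed_mappings)
    while frontier:
        src, tgt = frontier.pop()
        # Candidate pairs are formed from the neighbours on both sides.
        candidates = [(s, t)
                      for s in parents(src) + children(src)
                      for t in parents(tgt) + children(tgt)]
        for pair in candidates:
            if pair in extended:
                continue
            score = bert_score(*pair)
            if score >= threshold:
                extended[pair] = score
                frontier.append(pair)       # newly accepted pairs are expanded too
    return extended
```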
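
The Dataset Splits row gives the ratios only. Below is a minimal sketch of how such a split could be carried out; the function name and the fixed random seed are illustrative assumptions, not details from the paper or its code.

```python
import random

def split_reference_mappings(mappings, semi_supervised=False, seed=42):
    """Split reference mappings as described in the paper:
    unsupervised: 10% validation / 90% test;
    semi-supervised: 20% train / 10% validation / 70% test.
    (Function name and seed are illustrative assumptions.)
    """
    mappings = list(mappings)
    random.Random(seed).shuffle(mappings)
    n = len(mappings)
    if semi_supervised:
        n_train, n_val = int(0.2 * n), int(0.1 * n)
        return {
            "train": mappings[:n_train],
            "val": mappings[n_train:n_train + n_val],
            "test": mappings[n_train + n_val:],
        }
    n_val = int(0.1 * n)
    return {"val": mappings[:n_val], "test": mappings[n_val:]}
```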
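
The Software Dependencies row notes that owlready2 and transformers are used without pinned versions. The snippet below only illustrates how the two libraries typically enter such a pipeline (loading an ontology's class labels and a pretrained BERT classifier); the ontology path and the checkpoint name are placeholders, not values taken from the paper.

```python
from owlready2 import get_ontology
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load a source ontology and collect its class labels (path is a placeholder).
onto = get_ontology("file:///data/fma.owl").load()
class_labels = {cls.iri: list(cls.label) for cls in onto.classes()}

# Load a BERT checkpoint with a binary (synonym / non-synonym) head;
# the checkpoint name is an assumption, not necessarily the one used in the paper.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
```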
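
The Experiment Setup row maps fairly directly onto a Hugging Face Trainer configuration. The sketch below assumes the standard Trainer/TrainingArguments API (no versions are pinned in the paper, so argument names may differ across transformers releases); the output directory is a placeholder, and the sub-word inverted-index cut-off of 200 is a separate prediction-time setting not covered here.

```python
from transformers import Trainer, TrainingArguments

def build_trainer(model, train_dataset, val_dataset, batch_size=32):
    """Configure fine-tuning as described in the Experiment Setup row.

    `model` and the two tokenised synonym-pair datasets (built with a 1:4
    positive-to-negative ratio) are assumed to be prepared already.
    """
    steps_per_epoch = max(1, len(train_dataset) // batch_size)
    eval_every = max(1, steps_per_epoch // 10)      # "every 0.1 epoch"
    args = TrainingArguments(
        output_dir="bertmap_ft",                    # placeholder path
        num_train_epochs=3,                         # "fine-tuned for 3 epochs"
        per_device_train_batch_size=batch_size,     # "batch size of 32"
        evaluation_strategy="steps",
        eval_steps=eval_every,
        save_strategy="steps",
        save_steps=eval_every,
        load_best_model_at_end=True,                # keep the best checkpoint ...
        metric_for_best_model="eval_loss",          # ... by validation cross-entropy
        greater_is_better=False,
    )
    return Trainer(model=model, args=args,
                   train_dataset=train_dataset, eval_dataset=val_dataset)
```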