Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

How Long Will It Take? Accurate Prediction of Ontology Reasoning Performance

Authors: Yong-Bin Kang, Jeff Z. Pan, Shonali Krishnaswamy, Wudhichart Sawangphol, Yuan-Fang Li

AAAI 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our large-scale experiments on 6 state-of-the-art OWL 2 DL reasoners and more than 450 significantly diverse ontologies demonstrate that the prediction models achieve high accuracy, good generalizability and statistical significance.
Researcher Affiliation	Academia	Yong-Bin Kang, Monash University, Australia; Jeff Z. Pan, University of Aberdeen, UK; Shonali Krishnaswamy, Inst. for Infocomm Research, Singapore; Wudhichart Sawangphol, Monash University, Australia; Yuan-Fang Li, Monash University, Australia
Pseudocode	Yes	Algorithm 1: Regression-based performance hotspot identification.
Open Source Code	No	The paper states that "The ontologies and the prediction models are available at http://bit.ly/1hSTy87", but it does not explicitly mention that the source code for the methodology is provided.
Open Datasets	Yes	451 real-world, public-domain ontologies are collected, some of which from the Tones Ontology Repository and the Bio Ontology repository. ... http://owl.cs.manchester.ac.uk/repository/, http://www.bioontology.org/
Dataset Splits	Yes	Lastly, the dataset of each reasoner is divided up into a training set and a test set in a 80/20 split. The training set is used for training the regression model (with 10-fold cross-validation)... Stratified sampling is performed with data points divided into 5 equal percentile groups on the response variable (reasoning time).
Hardware Specification	Yes	All experiments are conducted on a high-performance server running OS Linux 2.6.18 and Java 1.6 on an Intel Xeon X7560 CPU at 2.27GHz. A maximum of 32GB memory is allocated to each of the 6 reasoners to accommodate potential memory leak in reasoners from repeated invocations.
Software Dependencies	Yes	6 state-of-the-art OWL 2 DL reasoners are selected for the experiment: FaCT++ (version 1.5.3), HermiT (version 1.3.6), JFact (version 0.9), MORe (version 0.1.6, with HermiT as the underlying OWL 2 DL reasoner), Pellet (version 2.2.0) and TrOWL (version 0.8).
Experiment Setup	Yes	Standard 10-fold cross-validation is performed to ensure the generalizability of the model. ... Stratified sampling is performed with data points divided into 5 equal percentile groups on the response variable (reasoning time). ... A maximum of 32GB memory is allocated to each of the 6 reasoners... We also apply a 20,000-second timeout... In our experiments we set the ratio threshold to 10% of the number of logical axioms in the ontology and l to 6. ... k is set to 1,000 in our experiments.
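
The split described in the Dataset Splits and Experiment Setup rows (an 80/20 train/test split, stratified over 5 equal percentile groups of reasoning time, followed by 10-fold cross-validation on the training set) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function names (`percentile_bins`, `stratified_split`, `k_folds`), the random seeds, and the synthetic log-normal reasoning times are all assumptions made for the example, and the table does not say whether the cross-validation folds were themselves stratified, so plain shuffled folds are used here.

```python
import random


def percentile_bins(values, n_bins=5):
    """Assign each value to one of n_bins equal-size groups by rank (hypothetical helper)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = min(rank * n_bins // len(values), n_bins - 1)
    return bins


def stratified_split(values, test_fraction=0.2, n_bins=5, seed=0):
    """Return (train, test) index lists, sampling the test set from each percentile group."""
    rng = random.Random(seed)
    bins = percentile_bins(values, n_bins)
    train, test = [], []
    for b in range(n_bins):
        members = [i for i, group in enumerate(bins) if group == b]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


def k_folds(indices, k=10, seed=0):
    """Shuffle the indices and deal them into k folds of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


# Synthetic stand-in for per-ontology reasoning times in seconds (assumed data).
data_rng = random.Random(42)
times = [data_rng.lognormvariate(2.0, 2.0) for _ in range(450)]

train_idx, test_idx = stratified_split(times)
folds = k_folds(train_idx)
print(len(train_idx), len(test_idx), len(folds))  # 360 90 10
```

Stratifying on the response variable keeps slow and fast ontologies represented in both sets, which matters when reasoning times are heavily skewed.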