Query Rewriting for Ontology-Mediated Conditional Answers

Authors: Medina Andresel, Magdalena Ortiz, Mantas Simkus
Pages: 2734-2741

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical Evaluation. To demonstrate the potential usefulness of our approach, we developed a prototype implementation of the AOMQ rewriting. It was done in Java using Apache Jena 2.11 and Jena ARQ as SPARQL query engine, and tested on a MacBook Pro i5 2.7, Sierra OS. We used the ontology, data, and a data generation tool from the MyITS project (Eiter, Krennwallner, and Schneider 2013; Eiter et al. 2015). The tool creates ABoxes with assertions for spatial relations like locNext out of OpenStreetMap data, using parameters such as distance to create large sets of facts, in addition to other local data (e.g., crowd-sourced restaurant data). Then one can pose queries that need both parts of data, as well as ontological reasoning, to get answers (e.g., hotels in residential areas close to a subway station). Given the geospatial querying example in the introduction, MyITS is a relevant test case. Indeed, instead of creating large ABoxes, we may want to keep the access to spatial data remote. To simulate this scenario, we extracted some spatial relations and out-sourced their access via a SPARQL endpoint (using Jena Fuseki). This resulted in two sources: the local datasets with 227634 RDF triples, and a remote one with more than 2 million triples. We created 5 AOMQs based on test queries of Eiter et al. (2015), and treated spatial atoms as assumption patterns. In this way, we can query the local datasets and verify in the remote access point whether the spatial relations hold, only for the relevant candidates. Table 1 shows for these queries the sizes of rewritings w.r.t. T, and (T, H), and the size of cansmin, which gives a bound on the number of remote tests. We evaluated the time needed to answer the full rewriting over the local dataset, the time to construct the set cansmin, and the time to test remotely the spatial atoms (using SPARQL ASK queries).
The results show that evaluating Rew(Q) and constructing cansmin(Q) is very efficient, while testing the assumptions remotely was more expensive, as expected. In practice, this delay may be amortized in many cases, e.g., if many queries share remote tests. As a sanity check, we compared the total time needed by our approach to posing a federated SPARQL query (W3C 2013) using both data sources. The latter approach was slower, even despite the fact that we disregarded ontological reasoning; naively posing the result of TBox rewriting as a federated query seems infeasible.
Researcher Affiliation | Academia | Medina Andresel, Magdalena Ortiz, Mantas Šimkus {andresel, ortiz}@kr.tuwien.ac.at, simkus@dbai.tuwien.ac.at Institute of Logic and Computation, TU Wien, Austria
Pseudocode | No | The paper describes algorithms in prose and uses formal definitions and lemmas, but it does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code | No | The paper mentions developing a prototype implementation but does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described.
Open Datasets | Yes | We used the ontology, data, and a data generation tool from the MyITS project (Eiter, Krennwallner, and Schneider 2013; Eiter et al. 2015).
Dataset Splits | No | The paper describes the total size of local and remote datasets but does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning.
Hardware Specification | Yes | It was done in Java using Apache Jena 2.11 and Jena ARQ as SPARQL query engine, and tested on a MacBook Pro i5 2.7, Sierra OS.
Software Dependencies | Yes | It was done in Java using Apache Jena 2.11 and Jena ARQ as SPARQL query engine, and tested on a MacBook Pro i5 2.7, Sierra OS.
Experiment Setup | No | The paper describes the setup of data sources (local and remote) and types of queries, but it does not provide specific experimental setup details (concrete hyperparameter values, training configurations, or system-level settings) in the main text.
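
Illustrative sketch (not from the paper): the Research Type row describes a two-stage evaluation in which candidate answers are first obtained from the local data and the spatial atoms they depend on are then tested remotely with SPARQL ASK queries. The Python sketch below shows one way such a loop might look. It is not the authors' Java/Jena prototype and does not reproduce the paper's definition of cansmin. The namespace, the entity names, the helper functions, and the in-memory stand-in for the remote endpoint are all hypothetical. Each distinct assumption atom is tested at most once, which mirrors the remark that the delay can be amortized when remote tests are shared.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# A ground assumption atom (predicate, subject, object); all names are hypothetical.
Atom = Tuple[str, str, str]
# A candidate answer from the local stage, paired with the atoms it assumes.
Candidate = Tuple[str, FrozenSet[Atom]]

NS = "http://example.org/"  # hypothetical namespace, not taken from the paper


def ask_query(atom: Atom) -> str:
    """Render one ground atom as a SPARQL ASK query string."""
    predicate, subject, obj = atom
    return f"ASK {{ <{NS}{subject}> <{NS}{predicate}> <{NS}{obj}> }}"


def verify_candidates(
    candidates: List[Candidate],
    remote_ask: Callable[[str], bool],
) -> Tuple[List[str], int]:
    """Keep candidates whose assumptions all hold remotely.

    Returns the confirmed answers and the number of remote tests issued.
    Results are cached so a shared atom is tested only once.
    """
    cache: Dict[Atom, bool] = {}
    answers: List[str] = []
    for answer, assumptions in candidates:
        holds = True
        for atom in sorted(assumptions):
            if atom not in cache:
                cache[atom] = remote_ask(ask_query(atom))
            if not cache[atom]:
                holds = False
                break  # one failed assumption rules the candidate out
        if holds:
            answers.append(answer)
    return answers, len(cache)


# Stand-in for the remote endpoint: the set of ASK queries that would return true.
REMOTE_FACTS = {
    ask_query(atom)
    for atom in [
        ("locNext", "hotel1", "station7"),
        ("within", "station7", "area1"),
        ("locNext", "hotel3", "station9"),
    ]
}


def fake_remote_ask(query: str) -> bool:
    """Hypothetical replacement for sending an ASK query over HTTP."""
    return query in REMOTE_FACTS


CANDIDATES: List[Candidate] = [
    ("hotel1", frozenset({("locNext", "hotel1", "station7"),
                          ("within", "station7", "area1")})),
    ("hotel2", frozenset({("locNext", "hotel2", "station7"),
                          ("within", "station7", "area1")})),
    ("hotel3", frozenset({("locNext", "hotel3", "station9")})),
]

answers, remote_tests = verify_candidates(CANDIDATES, fake_remote_ask)
print(answers, remote_tests)
```

In this stubbed run the three candidates mention five assumption atoms, but only four remote tests are issued because within(station7, area1) is shared. In a real deployment, fake_remote_ask would be replaced by a call to the actual SPARQL endpoint.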