Foundations of Declarative Data Analysis Using Limit Datalog Programs

Authors: Mark Kaminski, Bernardo Cuenca Grau, Egor V. Kostylev, Boris Motik, Ian Horrocks

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Motivated by applications in declarative data analysis, we study Datalog_Z, an extension of positive Datalog with arithmetic functions over integers. This language is known to be undecidable, so we propose two fragments. In limit Datalog_Z, predicates are axiomatised to keep minimal/maximal numeric values, allowing us to show that fact entailment is coNEXPTIME-complete in combined, and coNP-complete in data complexity. Moreover, an additional stability requirement causes the complexity to drop to EXPTIME and PTIME, respectively. Finally, we show that stable Datalog_Z can express many useful data analysis tasks, and so our results provide a sound foundation for the development of advanced information systems.
Researcher Affiliation | Academia | Department of Computer Science, University of Oxford, UK
Pseudocode | Yes | Algorithm 1: Entailment for Semi-Ground Stable Programs
Open Source Code | No | The paper does not provide an unambiguous statement or a link to open-source code for the methodology it describes.
Open Datasets | No | The paper discusses example datasets (Dtw, Dcp, Dbcp) for illustrative purposes but does not provide concrete access information (link, DOI, repository, or formal citation) for any publicly available dataset used for empirical evaluation.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology). The paper is theoretical and does not report on empirical data evaluation.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts) used for running any computational experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate any computational aspects.
Experiment Setup | No | The paper does not contain specific experimental setup details, such as concrete hyperparameter values, training configurations, or system-level settings. This is a theoretical paper that does not involve empirical experiments requiring such details.
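
The Research Type entry says that limit predicates "keep minimal/maximal numeric values". As a rough illustration of that idea only, the Python sketch below evaluates a hypothetical shortest-path style program in which `dist` behaves like a min limit predicate: for each node, only the smallest derived value is retained. The program, the names `edges`, `dist`, and `min_limit_fixpoint`, and the example graph are all assumptions made for this sketch. It is not the paper's Algorithm 1, and it assumes non-negative edge weights so that the naive loop terminates.

```python
import math


def min_limit_fixpoint(edges, source):
    """Naive fixpoint for a hypothetical program (an assumption, not from the paper):

        dist(source, 0).
        dist(y, m + w) <- dist(x, m), edge(x, y, w).

    `dist` is treated like a min limit predicate: for each node we keep
    only the smallest integer derived so far.
    Assumes non-negative weights, otherwise the loop may not terminate.
    """
    dist = {source: 0}
    changed = True
    while changed:
        changed = False
        for x, y, w in edges:
            if x in dist:
                candidate = dist[x] + w
                # Keep the new value only if it improves the current minimum.
                if candidate < dist.get(y, math.inf):
                    dist[y] = candidate
                    changed = True
    return dist


# Hypothetical weighted edges (x, y, w) for illustration.
edges = [("a", "b", 4), ("a", "c", 1), ("c", "b", 2), ("b", "d", 5)]
result = min_limit_fixpoint(edges, "a")
print(result)
```

Storing a single bound per node, rather than every derivable value, is what keeps the representation finite here; a max limit predicate would flip the comparison.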