reproducibilityindex.ai

Computing the Why-Provenance for Datalog Queries via SAT Solvers

Authors: Marco Calautti, Ester Livshits, Andreas Pieris, Markus Schneider

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, we experimentally conﬁrm the efﬁciency of our approach with a dedicated benchmark, and we further show that, in general, it outperforms the machinery proposed in (Elhalawati, Kr otzsch, and Mennicke 2022). Experimental Evaluation
Researcher Affiliation	Academia	1Department of Computer Science, University of Milan 2School of Informatics, University of Edinburgh 3Department of Computer Science, University of Cyprus
Pseudocode	No	The paper describes methods and processes conceptually and mathematically, but does not provide pseudocode or a clearly labeled algorithm block.
Open Source Code	Yes	The experimental scenarios and the source code are given at https://gitlab.com/aaai24whyprov/datalog-why-provenance.
Open Datasets	No	The paper mentions database names like Dbitcoin, Dfacebook, Doctors-i, Galen, Andersen, and CSDA in Table 1, stating they are 'from the Datalog literature' but does not provide specific links, DOIs, or formal citations for public access to these datasets themselves.
Dataset Splits	No	The paper evaluates the performance of the why-provenance computation on different databases and selected tuples, but it does not specify training, validation, or test dataset splits in the context of machine learning model training.
Hardware Specification	Yes	All the experiments have been conducted on a laptop with an Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz, and 32GB of RAM, running Fedora Linux 37.
Software Dependencies	Yes	We used Python 3.11.2, and the C++ code has been compiled with g++ 12.2.1, using the -O3 optimization ﬂag. To construct the downward closure we can exploit a state-of-the-art Datalog engine, that is, version 2.1.1 of DLV (Adrian et al. 2018). Finally, we ran the state-of-the-art SAT solver Glucose (see, e.g., (Audemard and Simon 2018)), version 4.2.1.
Experiment Setup	Yes	For each database of the scenario, we report a box-plot collecting all the runtimes relative to the 100 tuples associated to that database. Then, for each i [100], we constructed the downward closure of R( ti s,D) w.r.t. D and Σ by ﬁrst computing the adapted query Q and database D via a Python 3 implementation and then using DLV for the actual computation of the downward closure. Then, we constructed the Boolean formula φ( ti s,D,D,Q) via a C++ implementation. Finally, we ran the state-of-the-art SAT solver Glucose... for our implementation, we employ the technique of vertex elimination (Rankooh and Rintanen 2022). We set a 5 minutes timeout for both approaches.