Computing the Why-Provenance for Datalog Queries via SAT Solvers
Authors: Marco Calautti, Ester Livshits, Andreas Pieris, Markus Schneider
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we experimentally confirm the efficiency of our approach with a dedicated benchmark, and we further show that, in general, it outperforms the machinery proposed in (Elhalawati, Kr otzsch, and Mennicke 2022). Experimental Evaluation |
| Researcher Affiliation | Academia | 1Department of Computer Science, University of Milan 2School of Informatics, University of Edinburgh 3Department of Computer Science, University of Cyprus |
| Pseudocode | No | The paper describes methods and processes conceptually and mathematically, but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | The experimental scenarios and the source code are given at https://gitlab.com/aaai24whyprov/datalog-why-provenance. |
| Open Datasets | No | The paper mentions database names like Dbitcoin, Dfacebook, Doctors-i, Galen, Andersen, and CSDA in Table 1, stating they are 'from the Datalog literature' but does not provide specific links, DOIs, or formal citations for public access to these datasets themselves. |
| Dataset Splits | No | The paper evaluates the performance of the why-provenance computation on different databases and selected tuples, but it does not specify training, validation, or test dataset splits in the context of machine learning model training. |
| Hardware Specification | Yes | All the experiments have been conducted on a laptop with an Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz, and 32GB of RAM, running Fedora Linux 37. |
| Software Dependencies | Yes | We used Python 3.11.2, and the C++ code has been compiled with g++ 12.2.1, using the -O3 optimization flag. To construct the downward closure we can exploit a state-of-the-art Datalog engine, that is, version 2.1.1 of DLV (Adrian et al. 2018). Finally, we ran the state-of-the-art SAT solver Glucose (see, e.g., (Audemard and Simon 2018)), version 4.2.1. |
| Experiment Setup | Yes | For each database of the scenario, we report a box-plot collecting all the runtimes relative to the 100 tuples associated to that database. Then, for each i [100], we constructed the downward closure of R( ti s,D) w.r.t. D and Σ by first computing the adapted query Q and database D via a Python 3 implementation and then using DLV for the actual computation of the downward closure. Then, we constructed the Boolean formula φ( ti s,D,D,Q) via a C++ implementation. Finally, we ran the state-of-the-art SAT solver Glucose... for our implementation, we employ the technique of vertex elimination (Rankooh and Rintanen 2022). We set a 5 minutes timeout for both approaches. |