Enhancing Datalog Reasoning with Hypertree Decompositions
Authors: Xinyue Zhang, Pan Hu, Yavor Nenov, Ian Horrocks
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluation shows that, when the program contains complex rules, the combined approach is usually significantly faster than the baseline approach, sometimes by orders of magnitude. |
| Researcher Affiliation | Collaboration | 1 Department of Computer Science, University of Oxford, Oxford, UK 2 School of Electrical Information and Electronic Engineering, Shanghai Jiao Tong University, China 3 Oxford Semantic Technologies, Oxford, UK |
| Pseudocode | Yes | Algorithm 1 MAT(Π, E); Algorithm 2 DRed(Π, E, I, E+, E−); Algorithm 3 Add_r[I, E+]; Algorithm 4 Cross Node Evaluation_r(L); Algorithm 5 Del_r[I, E−]; Algorithm 6 Red_r[I, E−] |
| Open Source Code | Yes | Our test system and data are available online.3 3https://xinyuezhang.xyz/HDReasoning/ |
| Open Datasets | Yes | We tested our system using the well-known LUBM and YAGO benchmarks [Guo et al., 2005; Suchanek et al., 2008] |
| Dataset Splits | Yes | We evaluate the running time of materialisation (the initial materialisation with all the explicitly given facts inserted as E+ in Algorithm 2), small deletions (randomly selecting 1,000 facts from the dataset as E− in Algorithm 2), large deletions (randomly selecting 25% of the dataset as E−), and small additions (adding 1,000 facts as E+ into the dataset). |
| Hardware Specification | Yes | All of our experiments are conducted on a Dell Power Edge R730 server with 512GB RAM and 2 Intel Xeon E5-2640 2.60GHz processors, running Fedora 33, kernel version 5.10.8. |
| Software Dependencies | Yes | All of our experiments are conducted on a Dell Power Edge R730 server with 512GB RAM and 2 Intel Xeon E5-2640 2.60GHz processors, running Fedora 33, kernel version 5.10.8. |
| Experiment Setup | Yes | In our implementation, for (1), we use the standard textbook cardinality estimator described in Chapter 16.4 of [Garcia-Molina, 2008] to estimate the cardinality of ⋈_{Bi ∈ λ(p)} Bi for a node p; for (2), we use 2(\|p̂i\| + \|p̂j\|) to estimate the cost of performing semi-joins between nodes pi and pj, where \|p̂i\| and \|p̂j\| represent the estimated node sizes. Moreover, the extra full-reducer step we introduced in Algorithm 4 (line 6) is more suitable for small updates, in which the node with the smallest size helps reduce other large nodes. If the sizes of all the nodes are comparable, then this step would be unnecessary. Therefore, in practice, we only perform this optimisation if the number of active instantiations in the node (i.e., pi) is more than three times smaller than the maximum number of active instantiations in each node. |
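The cost heuristics quoted above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names and the representation of node sizes as plain integers are assumptions; only the formula 2(|p̂i| + |p̂j|) and the "three times smaller" trigger come from the paper.

```python
def semi_join_cost(size_i: int, size_j: int) -> int:
    """Estimated cost of a semi-join between nodes p_i and p_j,
    using the paper's formula 2(|p̂_i| + |p̂_j|), where sizes are
    estimated node cardinalities."""
    return 2 * (size_i + size_j)


def should_run_full_reducer(node_sizes: list[int]) -> bool:
    """Trigger for the extra full-reducer step (Algorithm 4, line 6):
    apply it only when the smallest node holds more than three times
    fewer active instantiations than the largest node, so the small
    node can cheaply reduce the large ones."""
    smallest, largest = min(node_sizes), max(node_sizes)
    return smallest * 3 < largest
```

For example, with node sizes [50, 1000] the optimisation fires (50 × 3 < 1000), whereas with comparable sizes such as [50, 60] it is skipped, matching the intuition that the step only pays off for small updates.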