Parallel Materialisation of Datalog Programs in Centralised, Main-Memory RDF Systems
Authors: Boris Motik, Yavor Nenov, Robert Piro, Ian Horrocks, Dan Olteanu
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluation shows that our approach parallelises computation very well: with 16 physical cores, materialisation can be up to 13.9 times faster than with just one core. We have implemented our approach in a new system called RDFox and have evaluated its performance on several synthetic and real-world datasets. |
| Researcher Affiliation | Academia | Boris Motik, Yavor Nenov, Robert Piro, Ian Horrocks and Dan Olteanu; Department of Computer Science, Oxford University, Oxford, United Kingdom; firstname.lastname@cs.ox.ac.uk |
| Pseudocode | Yes | Algorithm 1 Threads of the Materialisation Algorithm; Algorithm 2 I.nestedLoops(Q, F, τ, j); Algorithm 3 add triple(s, p, o); Algorithm 4 insert sp list(Tnew, T). |
| Open Source Code | Yes | All datasets, test systems, scripts, and test results are available online (http://www.cs.ox.ac.uk/isg/tools/RDFox/tests/). |
| Open Datasets | Yes | All datasets, test systems, scripts, and test results are available online (http://www.cs.ox.ac.uk/isg/tools/RDFox/tests/). -- Table 1 summarises our test datasets. LUBM (Guo, Pan, and Heflin 2005) and UOBM (Ma et al. 2006) are synthetic datasets. |
| Dataset Splits | No | The paper does not explicitly specify training, validation, or test dataset splits or percentages for reproducing the experiments. |
| Hardware Specification | Yes | We tested RDFox on a Dell computer with 128 GB of RAM, 64-bit Red Hat Enterprise Linux Server 6.3 kernel version 2.6.32, and two Xeon E5-2650 processors with 16 physical cores, extended to 32 virtual cores via hyperthreading; we ran the comparison tests on another Dell computer with 128 GB of RAM, 64-bit CentOS 6.4 kernel version 2.6.32, and two Xeon E5-2643 processors with 8/16 cores. |
| Software Dependencies | Yes | The system is written in C++, and it works on x86-64 computers running Windows, Linux, or Mac OS X. -- PostgreSQL (PG) 9.2 and MonetDB (MDB), Feb 2013-SP3 release. |
| Experiment Setup | Yes | RDFox was allowed to use at most 100 GB of RAM. We stored the databases of MDB and PG on a 100 GB RAM disk and allowed the rest to be used as the system's working memory; thus, PG and MDB could use more RAM in total than RDFox. On PG, we created all tables as UNLOGGED to eliminate the fault recovery overhead, and we configured the database to aggressively use memory using the following options: fsync = off, synchronous_commit = off, full_page_writes = off, bgwriter_lru_pages = 0, shared_buffers = 16GB, work_mem = 16GB, effective_cache_size = 16GB. DBRDF can store RDF data using one of the two well-known storage schemes. In the vertical partitioning (VP) variant... In the triple table (TT) variant... |
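
The headline figure in the Research Type row (up to 13.9 times faster on 16 physical cores) can be put in context with two standard scaling measures. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and the serial-fraction estimate uses the Karp-Flatt metric, which assumes that all lost speedup comes from a fixed serial portion of the work.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup achieved (1.0 means perfect scaling)."""
    return speedup / cores


def karp_flatt_serial_fraction(speedup: float, cores: int) -> float:
    """Experimentally determined serial fraction (Karp-Flatt metric).

    e = (1/S - 1/N) / (1 - 1/N), for measured speedup S on N cores.
    """
    if cores < 2:
        raise ValueError("the metric is only defined for two or more cores")
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    # Values quoted in the table above: 13.9x speedup on 16 physical cores.
    speedup, cores = 13.9, 16
    print(f"efficiency:      {parallel_efficiency(speedup, cores):.3f}")
    print(f"serial fraction: {karp_flatt_serial_fraction(speedup, cores):.4f}")
```

Under these assumptions the best reported run reaches roughly 87% of ideal linear scaling, which corresponds to an estimated serial fraction of about 1%. The figures describe only that best case, not every dataset in the evaluation.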