reproducibilityindex.ai

Column-Oriented Datalog Materialization for Large Knowledge Graphs

Authors: Jacopo Urbani, Ceriel Jacobs, Markus Krötzsch

AAAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our empirical evaluation shows that this approach can often match or even surpass the performance of state-of-the-art systems, especially under restricted resources. We evaluate a prototype implementation of our approach. Evaluation results show that our approach can significantly reduce the amount of main memory needed for materialization, while maintaining competitive runtimes.
Researcher Affiliation	Academia	Jacopo Urbani Dept. Computer Science VU University Amsterdam Amsterdam, The Netherlands jacopo@cs.vu.nl; Ceriel Jacobs Dept. Computer Science VU University Amsterdam Amsterdam, The Netherlands c.j.h.jacobs@vu.nl; Markus Krötzsch Faculty of Computer Science Technische Universität Dresden Dresden, Germany markus.kroetzsch@tu-dresden.de
Pseudocode	No	The paper describes its procedural steps, particularly for semi-naive evaluation and optimizations. However, it does not include a clearly labeled "Pseudocode" or "Algorithm" block or figure.
Open Source Code	Yes	Our source code and a short tutorial is found at https://github.com/jrbn/vlog.
Open Datasets	Yes	We used largely the same data that was also used to evaluate RDFox (Motik et al. 2014). Datasets and Datalog programs are available online (Footnote 2: http://www.cs.ox.ac.uk/isg/tools/RDFox/2014/AAAI/Data/Rules). The datasets we used are the cultural-heritage ontology Claros (Motik et al. 2014), the DBpedia KG extracted from Wikipedia (Bizer et al. 2009), and two differently sized graphs generated with the LUBM benchmark (Guo, Pan, and Heflin 2005).
Dataset Splits	No	The paper uses various datasets for evaluation but does not specify any training, validation, or test splits (e.g., percentages or sample counts) for these datasets. It refers to them as full datasets used for materialization.
Hardware Specification	Yes	The computer used in all experiments is a Macbook Pro with a 2.2GHz Intel Core i7 processor, 512GB SSD, and 16GB RAM running on Mac OS Yosemite OS v10.10.5.
Software Dependencies	Yes	All software (ours and competitors) was compiled from C++ sources using Apple CLang/LLVM v6.1.0.
Experiment Setup	Yes	In the "Experimental Setup" section, the paper describes system-level settings for VLog, such as "VLog was always used with dynamic optimizations activated but without memoization" and the "timeout (default 1 sec)" for memoization. It also details the specific versions of competitor software used.