Evaluating Epistemic Logic Programs via Answer Set Programming with Quantifiers
Authors: Wolfgang Faber, Michael Morak
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then experimentally verify that this encoding, together with a QBF solver-based ASP(Q) solver, indeed performs well, compared to current ELP solvers. We perform an experimental evaluation in Section 4. We tested the rewriting approach described in Section 3 for the G94 semantics, by benchmarking it against existing ELP solvers. |
| Researcher Affiliation | Academia | Wolfgang Faber, Michael Morak University of Klagenfurt, Austria wolfgang.faber@aau.at, michael.morak@aau.at |
| Pseudocode | No | The paper describes the construction of the sub-programs PΦ, P, P1, …, Pn, P, and C, but these are described in natural language and mathematical notation, not in a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper states: 'We implement a rewriting tool that performs our rewriting...' and refers to it as 'elp2qasp'. However, it does not provide a specific repository link or an explicit statement about the open-source availability of this tool. |
| Open Datasets | Yes | This is the same benchmark set as used and published by the authors of the selp solver, which they used in the associated conference publication (Bichler, Morak, and Woltran 2020). We briefly describe the benchmark set below. (...) Tree QBFs. The hardness proof of ELP consistency (Shen and Eiter 2016) relies on a reduction from the validity problem for restricted quantified Boolean formulas with three quantifier blocks (i.e., 3-QBFs), which can be generalized to arbitrary 3-QBFs (Bichler, Morak, and Woltran 2020). In that publication, the reduction is applied to the 14 Tree instances of QBFEVAL'16 (Pulina 2016), available at http://www.qbflib.org/family_detail.php?idFamily=56. |
| Dataset Splits | No | The paper uses three benchmark sets for its evaluation: Scholarship Eligibility, Yale Shooting, and Tree QBFs. However, it does not specify any training/validation/test splits, percentages, or split methodology for these datasets. |
| Hardware Specification | Yes | Experiments were run on an AMD EPYC 7601 system (2.2 GHz base clock speed) with 500 GiB of memory. Each process was assigned a maximum of 14 GB of RAM, which was never exceeded by any of the solvers tested. |
| Software Dependencies | Yes | For EP-ASP, we made trivial modifications to the Python code in order for it to run with clingo 5.4.1. For selp and qasp, the same version of clingo was used. For selp, in addition, we used the htd library, version 1.2.0, and lpopt 2.2. We used qasp 1.1.0 and q asp 0.1.2 as the backend solvers for our ASP(Q) rewriting generated by elp2qasp. |
| Experiment Setup | Yes | A time limit of 900 seconds was used for each benchmark set. For EP-ASP, we made trivial modifications to the Python code in order for it to run with clingo 5.4.1. (...) EP-ASP was called with the preprocessing option for brave and cautious consequences on, since it always ran faster this way. The time for selp, qasp, and q asp is the sum of the time it took to run all required components to solve the relevant instance. For selp, clingo was always called with SAT preprocessing enabled, as is recommended by the lpopt tool. (...) For selp, the underlying ASP solver clingo (Gebser et al. 2019) was stopped after finding the first answer set. For EP-ASP, search was terminated after finding the first candidate world view. Note that, for a fair comparison, we disabled the subset-maximality check on the guess that EP-ASP performs by default. |
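The experiment setup above reports a 900-second time limit per run and a 14 GB per-process RAM cap. As a minimal sketch of how such limits are typically enforced around a solver invocation, the following generic Python harness uses `subprocess` with a timeout and `resource.setrlimit` in the child process. The function name and example command are illustrative assumptions; this is not the authors' actual benchmarking script.

```python
import resource
import subprocess

# Limits as reported in the paper's experiment setup.
TIME_LIMIT_S = 900
MEM_LIMIT_BYTES = 14 * 1024**3  # 14 GB per process

def run_with_limits(cmd, time_limit=TIME_LIMIT_S, mem_limit=MEM_LIMIT_BYTES):
    """Run `cmd`, returning (returncode, stdout), or (None, None) on timeout.

    Hypothetical helper for illustration; POSIX only (uses RLIMIT_AS).
    """
    def set_mem_limit():
        # Applied in the child before exec: cap its address space.
        resource.setrlimit(resource.RLIMIT_AS, (mem_limit, mem_limit))

    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=time_limit,
            preexec_fn=set_mem_limit,
        )
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired:
        return None, None

if __name__ == "__main__":
    # Placeholder command standing in for a solver invocation.
    rc, out = run_with_limits(["echo", "solver finished"], time_limit=5)
    print(rc, out.strip())
```

A real harness would additionally record wall-clock time per component, since the reported times for selp, qasp, and q asp are sums over all pipeline components.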