reproducibilityindex.ai

Understanding Enthymemes in Deductive Argumentation Using Semantic Distance Measures

Authors: Anthony Hunter (pages 5729-5736)

AAAI 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To investigate this, we implemented the SUB algorithm (Algorithm 1), which randomly selects a set of atoms A, a set of clauses, a query, and a substitution S, and then calls UNSAT on the set formed by applying S to the clauses and adding the negated query, which is a call to a SAT solver. If the algorithm returns TRUE, that set is inconsistent and hence the query follows from the clauses under S, whereas if it returns FALSE, that set is consistent and hence the query does not follow from the clauses under S. The SUB algorithm is coded in Python, and uses the PySAT implementation (Ignatiev, Morgado, and Marques-Silva 2018) that incorporates a range of SAT solvers (e.g. Glucose3). Using Algorithm 1, we undertook an empirical investigation on the number of substitutions required to influence the inferences that follow from a knowledgebase. The results are reported in Figure 1.
Researcher Affiliation	Academia	Anthony Hunter Department of Computer Science, University College London, London, WC1E 6BT, UK anthony.hunter@ucl.ac.uk
Pseudocode	Yes	Algorithm 1: The SUB algorithm where, from atoms A, Literals(A) is the set of literals, Clauses(A) is the set of clauses each with 3 literals, and Subs(A,n) is the set of substitutions formed with n incoming symbols. Algorithm 2: The ABDUCE algorithm where |.| returns the cardinality of the set.
Open Source Code	Yes	See appendix for code. Footnote 1: Appendix: www0.cs.ucl.ac.uk/staff/a.hunter/papers/aaai22.zip
Open Datasets	No	The paper does not explicitly state that the dataset used for its empirical investigation (randomly selected atoms, clauses) is publicly available or open with specific access information. It references GloVe and WordNet as resources but not as the specific dataset used for its own experiments described.
Dataset Splits	No	The paper describes an empirical investigation with randomly selected sets of atoms and clauses. It discusses 'consistency ratio' but does not specify any training, validation, or test dataset splits in terms of percentages, counts, or predefined splits.
Hardware Specification	Yes	As an indication of the performance, for a knowledgebase of 100 clauses, a set of 10 defaults, and a substitution of 10 swaps, with 5 outgoing symbols, randomly generated from 20 atoms, the ABDUCE algorithm takes 0.22 seconds (average of 50 runs) on a Windows 10 laptop (AMD A10 Radeon R8 with 8GB RAM).
Software Dependencies	No	The paper mentions "Python", "PySAT implementation (Ignatiev, Morgado, and Marques-Silva 2018)", and "Glucose3" but does not specify version numbers for these software components.
Experiment Setup	Yes	The knowledgebase size is 50 clauses with each substitution having 10 incoming symbols. Each line is for a specific number of atoms in the language as defined in the legend (top right corner). Consistency ratio is the proportion out of 100 runs where the selection is consistent.
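
The refutation check quoted in the Research Type row can be illustrated with a minimal Python sketch. This is not the author's code (that is in the appendix archive), and several details are assumptions made for illustration only: the query is taken to be a single literal, a substitution is modelled as a map from atoms to replacement atoms, the function names are hypothetical, and a brute-force truth-table search stands in for the PySAT/Glucose3 call so that the example needs only the standard library.

```python
import itertools
import random


def random_clause(atoms, rng, width=3):
    """Build a clause with `width` literals: +i is atom i, -i is its negation."""
    chosen = rng.sample(atoms, width)
    return tuple(a if rng.random() < 0.5 else -a for a in chosen)


def apply_substitution(clauses, subs):
    """Replace each atom found in `subs` by its mapped atom, keeping polarity.

    Assumption: this dict-based model of a substitution is illustrative only.
    """
    def swap(lit):
        atom = subs.get(abs(lit), abs(lit))
        return atom if lit > 0 else -atom
    return [tuple(swap(lit) for lit in clause) for clause in clauses]


def unsat(clauses, atoms):
    """Brute-force stand-in for a SAT call: True if no assignment satisfies all clauses."""
    for values in itertools.product([False, True], repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(any(model[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return False
    return True


def sub_trial(num_atoms, num_clauses, num_swaps, rng):
    """One random trial: does a random literal follow from the substituted clauses?"""
    atoms = list(range(1, num_atoms + 1))
    clauses = [random_clause(atoms, rng) for _ in range(num_clauses)]
    query = rng.choice(atoms) * rng.choice([1, -1])  # assumption: query is a literal
    subs = {a: rng.choice(atoms) for a in rng.sample(atoms, num_swaps)}
    substituted = apply_substitution(clauses, subs)
    # The query follows exactly when the clauses plus the negated query are inconsistent.
    return unsat(substituted + [(-query,)], atoms)


if __name__ == "__main__":
    rng = random.Random(0)
    results = [sub_trial(8, 20, 3, rng) for _ in range(100)]
    print("proportion of trials where the query follows:", sum(results) / len(results))
```

The brute-force search is exponential in the number of atoms, so it only suits small languages. For sizes like those reported in the table, `unsat` could instead be implemented with a PySAT solver such as `Glucose3` and its `solve()` method, as the paper describes.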