Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Community Structure in Industrial SAT Instances
Authors: Carlos Ansótegui, Maria Luisa Bonet, Jesús Giráldez-Cru, Jordi Levy, Laurent Simon
JAIR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now present an experimental evaluation of the modularity-based preprocessor modprep... In our experimental evaluation, we use all application SAT benchmarks used in the SAT Competition from 2010 to 2017 (both included)... In Table 8, we represent the results of this experiment. In particular, we detail the number of solved instances by each of the four solvers (Glucose, MapleSAT, Lingeling and MiniSat) with and without using the modularity-based preprocessor modprep... In Fig. 3, we represent the cactus plot (i.e., number of instances solved in a certain wall clock time) of these four solvers with and without using modprep... |
| Researcher Affiliation | Academia | Carlos Ansótegui EMAIL DIEI, UdL, Jaume II 69, 25003 Lleida, Spain; Maria Luisa Bonet EMAIL CS, UPC, J. Girona 1-3, 08034 Barcelona, Spain; Jesús Giráldez-Cru EMAIL DaSCI, DECSAI, Universidad de Granada, 18071 Granada, Spain; Jordi Levy EMAIL IIIA, CSIC, Campus UAB, 08193 Bellaterra, Spain; Laurent Simon EMAIL Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, F-33400, Talence, France |
| Pseudocode | Yes | Algorithm 1: Louvain Method (LM) (Blondel et al., 2008), Algorithm 2: Folding function for bipartite graphs, Algorithm 3: Modularity-based SAT Instance Preprocessor (modprep) |
| Open Source Code | Yes | The software we use in the experimentation is publicly available in https://www.ugr.es/~jgiraldez/. |
| Open Datasets | Yes | We restrict our analysis to the application benchmarks commonly used in the SAT community, i.e., the benchmarks of the SAT Competitions... To this purpose, we carry out an analysis of all application benchmarks that have been used in the SAT Competitions between 2010 and 2017, both included. This set contains a total of 2550 industrial SAT instances. http://satcompetition.org/2013/ |
| Dataset Splits | No | The paper uses "all application SAT benchmarks used in the SAT Competition from 2010 to 2017 (both included)" but does not specify how these benchmarks were split into training, validation, or test sets for their experimental methodology. It distinguishes between SAT and UNSAT instances but not in terms of dataset splits for learning or evaluation of the preprocessor itself. |
| Hardware Specification | No | We performed these experiments with a limit of 64GB RAM, obtaining results for all instances except the 3 formulas of the family software-bmc, which are extremely huge. In this experiment, we limit the RAM memory usage to 16GB... There is no mention of specific CPU or GPU models. |
| Software Dependencies | No | We evaluate four well-known CDCL SAT solvers. Namely, they are Glucose (Audemard & Simon, 2009), MapleSAT, using its version with LRB (Liang, Ganesh, Poupart, & Czarnecki, 2016), Lingeling (Biere, 2013), and MiniSat (Eén & Sörensson, 2003). The paper mentions these solvers and SatELite (Eén & Biere, 2005) but does not provide specific version numbers for them. |
| Experiment Setup | Yes | The experiments were carried out limiting the memory usage to 16GB, and using a timeout of 5000 seconds. The preprocessor modprep is executed with a timeout of 100 seconds. In this experiment, we use Mini Sat, and we stop the solver after 10^3, 10^4 and 10^5 conflicts. |
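The pseudocode row cites the Louvain Method, which greedily maximizes graph modularity. As context for that row, here is a minimal sketch of the standard Newman modularity Q that Louvain optimizes, for an unweighted undirected graph; the function name and edge-list representation are illustrative assumptions, not the paper's implementation (the paper additionally folds bipartite graphs, which this sketch does not cover):

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of a node partition (illustrative sketch).

    edges: list of (u, v) pairs of an undirected graph.
    community: dict mapping each node to a community id.
    Q = sum over communities c of [ e_c / m - (d_c / 2m)^2 ],
    where e_c is the number of edges inside c, d_c the total degree of c,
    and m the total number of edges.
    """
    m = len(edges)                  # total number of undirected edges
    degree = defaultdict(int)       # node -> degree
    intra = defaultdict(int)        # community -> edges fully inside it
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    comm_degree = defaultdict(int)  # community -> sum of member degrees
    for node, k in degree.items():
        comm_degree[community[node]] += k
    return sum(intra[c] / m - (comm_degree[c] / (2 * m)) ** 2
               for c in set(community.values()))

# Two triangles joined by a single bridge edge: the natural two-community
# partition scores Q = 5/14 ≈ 0.357.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, partition))
```

Louvain iterates local node moves that increase this Q, then contracts each community to a single node and repeats; the paper's modprep preprocessor builds on the resulting community structure.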