Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Proofs and Certificates for Max-SAT
Authors: Matthieu Py, Mohamed Sami Cherif, Djamal Habet
JAIR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Both tools are evaluated on the unweighted and weighted benchmark instances of the 2020 Max-SAT Evaluation. |
| Researcher Affiliation | Academia | Matthieu Py EMAIL Universit e Clermont Auvergne, Mines Saint-Etienne, CNRS, LIMOS, F-63000 Clermont-Ferrand, France Mohamed Sami Cherif EMAIL Djamal Habet EMAIL Aix-Marseille Universit e, Universit e de Toulon, CNRS, LIS, Marseille, France |
| Pseudocode | Yes | Algorithm 1 MS-Builder Require: Formula φ Ensure: Max-SAT certificate c for φ... Algorithm 2 MS-Checker Require: Formula φ, Max-SAT certificate C for φ Ensure: True if C is a valid certificate for φ, False otherwise |
| Open Source Code | Yes | We have implemented MS-Builder and MS-Checker in C++2. the source code is available on https://pageperso.lis-lab.fr/matthieu.py/en/software.html |
| Open Datasets | Yes | Both tools are experimentally evaluated on the unweighted and weighted benchmarks of the 2020 Max-SAT Evaluation (Bacchus et al., 2020). |
| Dataset Splits | No | The paper mentions using 'the benchmark of the unweighted partial track and of the weighted partial track of the 2020 Max-SAT Evaluation', which refers to a set of instances for solver evaluation, not a dataset with traditional training/test/validation splits. Therefore, no such split information is provided. |
| Hardware Specification | Yes | The experiments are performed on Dell Power Edge M620 servers with Intel Xeon Silver E5-2609 processors (clocked at 2.5 2.6 GHz) under Ubuntu 18.04. Each solving process is allocated a slot of 1 hour and at most 16 GB of memory per instance. |
| Software Dependencies | No | Resolution Refutations are computed using Booleforce (Biere, 2010; E en & S orensson, 2003) and Tracecheck (Biere, 2006, 2008). While software tools are named, specific version numbers for Booleforce and Tracecheck are not explicitly provided in the text. |
| Experiment Setup | Yes | Each solving process is allocated a slot of 1 hour and at most 16 GB of memory per instance. |