Justifying All Differences Using Pseudo-Boolean Reasoning
Authors: Jan Elffers, Stephan Gocht, Ciaran McCreesh, Jakob Nordström1486-1494
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments To test our solver and proof verifier, we generated a number of 25 25 unsatisfiable Sudoku instances. Solving such instances in a reasonable amount of time requires the full capabilities of all-different propagation, and tests all of the functionality of our tools. For a representative instance which required 1,691 guessed decisions to solve, our solver took 41 seconds to prove unsatisfiability, which increased to 42 seconds when logging a proof (we stress that this implementation has not been designed for performance). The proof log contained 109,519 Hall set propagations, and 846 Hall violators, and could be verified in 6 seconds. We also tried deliberately introducing bugs into our solver for example, by failing to find maximum cardinality matchings that required two augmenting steps, and by randomly omitting logging for a small number of Hall sets. In each case the proof verifier caught the mistakes, although only if the instances selected actually triggered the faulty behaviour. |
| Researcher Affiliation | Academia | Jan Elffers,1,2 Stephan Gocht,1,2 Ciaran Mc Creesh,3 Jakob Nordstr om2,4 1Lund University, Lund, Sweden 2University of Copenhagen, Copenhagen, Denmark 3University of Glasgow, Glasgow, Scotland 4KTH Royal Institute of Technology, Stockholm, Sweden |
| Pseudocode | No | The paper describes algorithms and proof rules but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We have implemented a proof checking tool for this proof format.2 It is written in Python, with critical parts in C++ for performance reasons. The tool can also output a log of exactly what it is deriving at every stage of the verification process, which we have found to be tremendously helpful when debugging solvers. Footnote 2: https://github.com/Stephan Gocht/Veri PB/, https://doi.org/10.5281/zenodo.3548582. Finally, we briefly describe the implementation of a small constraint programming solver which can output a justification of all of the choices it makes.3 This solver is implemented in C++, and supports the all different constraint with full GAC propagation, as well as equals, not-equals, and (forward checking) table constraints. Footnote 3: https://github.com/ciaranm/certified-constraint-solver, https://doi.org/10.5281/zenodo.3549712 |
| Open Datasets | No | The paper states: "To test our solver and proof verifier, we generated a number of 25 × 25 unsatisfiable Sudoku instances." However, it does not provide any specific access information (link, DOI, formal citation) for these generated instances, nor does it refer to a well-known public dataset with clear access details. |
| Dataset Splits | No | The paper does not discuss training, validation, or test dataset splits. It mentions using "Sudoku instances" for testing its solver, but not in the context of formal data partitioning for model training and evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, memory specifications) used for running the experiments. It only mentions the programming languages used for the implementation. |
| Software Dependencies | No | The paper states the proof checking tool is "written in Python, with critical parts in C++" and the solver is "implemented in C++". However, it does not provide specific version numbers for Python, C++, or any other libraries or software components used, which are necessary for reproducible software dependencies. |
| Experiment Setup | No | The paper describes the methodology and proof format in detail, including rules like 'p' and 'u', but it does not provide specific experimental setup details such as hyperparameters, training configurations, or system-level settings for the solver's execution beyond the general description of its capabilities. The number of 'guessed decisions' is an outcome of the run, not a setup parameter. |