Learning Reliable Logical Rules with SATNet
Authors: Zhaoyu Li, Jinpei Guo, Yuhe Jiang, Xujie Si
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the efficacy of our proposed framework, we conduct a series of experiments on various tasks, including stream transformations (i.e., the parity function, addition, and counting) and Sudoku puzzles. For stream transformations, we strictly verify that our decoded weighted MaxSAT formula is functionally equivalent to the human-written logical rules, which ensures that it accurately captures the underlying logic of these tasks. On Sudoku puzzles, we show that using exact solvers on our generated formula could achieve 100% accuracy on the testing set and further verify that it functions equivalently to the ground truth for all unique puzzle inputs. |
| Researcher Affiliation | Academia | ¹University of Toronto, ²Vector Institute, ³Mila, ⁴Shanghai Jiao Tong University. {zhaoyu, six}@cs.toronto.edu |
| Pseudocode | No | The paper describes algorithms and procedures (e.g., IHT algorithm), but does not present them in a structured pseudocode or algorithm block format. |
| Open Source Code | No | The paper does not include a statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | For the 4×4 Sudoku puzzles, we construct a dataset by enumerating all possible puzzles that have a unique solution with minimal inputs, resulting in a total of 85,632 puzzles. To ensure proper evaluation, we split this dataset into training and testing sets with a ratio of 9:1. For the 9×9 Sudoku puzzles, we use the same dataset in SATNet [32], consisting of 9,000 instances for training and 1,000 instances for testing. |
| Dataset Splits | Yes | For all these datasets, we split them into training/testing sets with 90%/10% proportions. |
| Hardware Specification | Yes | All experiments are carried out on a single RTX8000 GPU and 16 CPU cores across 100 epochs, using the Adam optimizer with a learning rate of 2×10⁻³. |
| Software Dependencies | Yes | Given the generated rules, we utilize Gurobi [13] for exact inference and use the state-of-the-art MaxSAT solver CASHWMaxSAT [16] for verification. |
| Experiment Setup | Yes | All experiments are carried out on a single RTX8000 GPU and 16 CPU cores across 100 epochs, using the Adam optimizer with a learning rate of 2×10⁻³. All other hyperparameters are also set the same as in the paper [32]. |
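The "Dataset Splits" and "Experiment Setup" rows above pin down the training configuration: a 90%/10% train/test split, the Adam optimizer with learning rate 2×10⁻³, and 100 epochs. The sketch below shows how that configuration could be reproduced in PyTorch; since the paper does not release code, the `SudokuRuleNet` placeholder model, the tensor shapes, and the batch size are hypothetical stand-ins rather than the authors' implementation.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# Hypothetical stand-in for the paper's SATNet-based rule-learning model;
# the real architecture follows SATNet [32] and is not publicly released.
class SudokuRuleNet(torch.nn.Module):
    def __init__(self, n_vars: int, n_clauses: int):
        super().__init__()
        # Learnable clause-weight matrix in the spirit of a SATNet layer (assumption).
        self.S = torch.nn.Parameter(torch.randn(n_clauses, n_vars) * 0.1)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Placeholder forward pass; the actual layer solves an SDP relaxation.
        return torch.sigmoid(z @ self.S.t() @ self.S)

# Hypothetical bit-encoded puzzles and solutions (sizes are illustrative only).
puzzles, solutions = torch.rand(85632, 64), torch.rand(85632, 64)
dataset = TensorDataset(puzzles, solutions)

# 90%/10% train/test split, as reported in the paper.
n_train = int(0.9 * len(dataset))
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
train_loader = DataLoader(train_set, batch_size=40, shuffle=True)  # batch size assumed

model = SudokuRuleNet(n_vars=64, n_clauses=300)  # sizes are illustrative only
optimizer = torch.optim.Adam(model.parameters(), lr=2e-3)  # settings from the paper

for epoch in range(100):  # 100 epochs, as reported
    for z, target in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.binary_cross_entropy(model(z), target)
        loss.backward()
        optimizer.step()
```

The "Software Dependencies" row states that Gurobi is used for exact inference on the decoded weighted MaxSAT formula. The snippet below is a sketch of one standard encoding of weighted MaxSAT as an integer program with `gurobipy`; the clause list and weights are made-up examples, not the formula decoded in the paper.

```python
import gurobipy as gp
from gurobipy import GRB

# Made-up weighted MaxSAT instance: each soft clause is (weight, [literals]),
# where literal i means x_i and literal -i means NOT x_i.
soft_clauses = [(3.0, [1, -2]), (5.0, [2, 3]), (1.0, [-1, -3])]
n_vars = 3

m = gp.Model("weighted_maxsat")
x = m.addVars(range(1, n_vars + 1), vtype=GRB.BINARY, name="x")
relax = m.addVars(range(len(soft_clauses)), vtype=GRB.BINARY, name="r")

# A clause is satisfied if at least one literal is true; the relaxation
# variable r_j absorbs a violation at the cost of the clause weight.
for j, (_, lits) in enumerate(soft_clauses):
    expr = gp.quicksum(x[l] if l > 0 else 1 - x[-l] for l in lits)
    m.addConstr(expr + relax[j] >= 1)

# Minimize the total weight of violated (relaxed) clauses.
m.setObjective(gp.quicksum(w * relax[j] for j, (w, _) in enumerate(soft_clauses)),
               GRB.MINIMIZE)
m.optimize()

assignment = {i: int(x[i].X) for i in range(1, n_vars + 1)}
print(assignment)
```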
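For verification rather than inference, the paper reports using the MaxSAT solver CASHWMaxSAT [16], which consumes the standard WCNF format; the same made-up instance above could be written out as a WCNF file and passed to any MaxSAT-Evaluation-style solver, with hard clauses given the top weight and soft clauses their learned weights.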