Learning Reliable Logical Rules with SATNet
Authors: Zhaoyu Li, Jinpei Guo, Yuhe Jiang, Xujie Si
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the efficacy of our proposed framework, we conduct a series of experiments on various tasks, including stream transformations (i.e., the parity function, addition, and counting) and Sudoku puzzles. For stream transformations, we strictly verify that our decoded weighted MaxSAT formula is functionally equivalent to the human-written logical rules, which ensures that it accurately captures the underlying logic of these tasks. On Sudoku puzzles, we show that using exact solvers on our generated formula could achieve 100% accuracy on the testing set and further verify that it functions equivalently to the ground truth for all unique puzzle inputs. |
| Researcher Affiliation | Academia | ¹University of Toronto, ²Vector Institute, ³Mila, ⁴Shanghai Jiao Tong University. {zhaoyu, six}@cs.toronto.edu |
| Pseudocode | No | The paper describes algorithms and procedures (e.g., IHT algorithm), but does not present them in a structured pseudocode or algorithm block format. |
| Open Source Code | No | The paper does not include a statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | For the 4×4 Sudoku puzzles, we construct a dataset by enumerating all possible puzzles that have a unique solution with minimal inputs, resulting in a total of 85,632 puzzles. To ensure proper evaluation, we split this dataset into training and testing sets with a ratio of 9:1. For the 9×9 Sudoku puzzles, we use the same dataset in SATNet [32], consisting of 9,000 instances for training and 1,000 instances for testing. |
| Dataset Splits | Yes | For all these datasets, we split them into training/testing sets with 90%/10% proportions. |
| Hardware Specification | Yes | All experiments are carried out on a single RTX8000 GPU and 16 CPU cores across 100 epochs, using the Adam optimizer with a learning rate of 2×10⁻³. |
| Software Dependencies | Yes | Given the generated rules, we utilize Gurobi [13] for exact inference and use the state-of-the-art MaxSAT solver CASHWMaxSAT [16] for verification. |
| Experiment Setup | Yes | All experiments are carried out on a single RTX8000 GPU and 16 CPU cores across 100 epochs, using the Adam optimizer with a learning rate of 2×10⁻³. All other hyperparameters are also set the same as in the paper [32]. |
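The "Dataset Splits" and "Experiment Setup" rows above pin down the training configuration: a 90%/10% train/test split, the Adam optimizer with learning rate 2×10⁻³, and 100 epochs. The sketch below shows how that configuration could be reproduced in PyTorch; since the paper does not release code, the `SudokuRuleNet` placeholder model, the tensor shapes, and the batch size are hypothetical stand-ins rather than the authors' implementation.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# Hypothetical stand-in for the paper's SATNet-based rule-learning model;
# the real architecture follows SATNet [32] and is not publicly released.
class SudokuRuleNet(torch.nn.Module):
    def __init__(self, n_vars: int, n_clauses: int):
        super().__init__()
        # Learnable clause-weight matrix in the spirit of a SATNet layer (assumption).
        self.S = torch.nn.Parameter(torch.randn(n_clauses, n_vars) * 0.1)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Placeholder forward pass; the actual layer solves an SDP relaxation.
        return torch.sigmoid(z @ self.S.t() @ self.S)

# Hypothetical bit-encoded puzzles and solutions (sizes are illustrative only).
puzzles, solutions = torch.rand(85632, 64), torch.rand(85632, 64)
dataset = TensorDataset(puzzles, solutions)

# 90%/10% train/test split, as reported in the paper.
n_train = int(0.9 * len(dataset))
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
train_loader = DataLoader(train_set, batch_size=40, shuffle=True)  # batch size assumed

model = SudokuRuleNet(n_vars=64, n_clauses=300)  # sizes are illustrative only
optimizer = torch.optim.Adam(model.parameters(), lr=2e-3)  # settings from the paper

for epoch in range(100):  # 100 epochs, as reported
    for z, target in train_loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.binary_cross_entropy(model(z), target)
        loss.backward()
        optimizer.step()
```

The "Software Dependencies" row states that Gurobi is used for exact inference on the decoded weighted MaxSAT formula. The snippet below is a sketch of one standard encoding of weighted MaxSAT as an integer program with `gurobipy`; the clause list and weights are made-up examples, not the formula decoded in the paper.

```python
import gurobipy as gp
from gurobipy import GRB

# Made-up weighted MaxSAT instance: each soft clause is (weight, [literals]),
# where literal i means x_i and literal -i means NOT x_i.
soft_clauses = [(3.0, [1, -2]), (5.0, [2, 3]), (1.0, [-1, -3])]
n_vars = 3

m = gp.Model("weighted_maxsat")
x = m.addVars(range(1, n_vars + 1), vtype=GRB.BINARY, name="x")
relax = m.addVars(range(len(soft_clauses)), vtype=GRB.BINARY, name="r")

# A clause is satisfied if at least one literal is true; the relaxation
# variable r_j absorbs a violation at the cost of the clause weight.
for j, (_, lits) in enumerate(soft_clauses):
    expr = gp.quicksum(x[l] if l > 0 else 1 - x[-l] for l in lits)
    m.addConstr(expr + relax[j] >= 1)

# Minimize the total weight of violated (relaxed) clauses.
m.setObjective(gp.quicksum(w * relax[j] for j, (w, _) in enumerate(soft_clauses)),
               GRB.MINIMIZE)
m.optimize()

assignment = {i: int(x[i].X) for i in range(1, n_vars + 1)}
print(assignment)
```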
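For verification rather than inference, the paper reports using the MaxSAT solver CASHWMaxSAT [16], which consumes the standard WCNF format; the same made-up instance above could be written out as a WCNF file and passed to any MaxSAT-Evaluation-style solver, with hard clauses given the top weight and soft clauses their learned weights.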