Grounding Neural Inference with Satisfiability Modulo Theories
Authors: Zifan Wang, Saranya Vijayakumar, Kaiji Lu, Vijay Ganesh, Somesh Jha, Matt Fredrikson
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results show that it leads to models that (1) require fewer training samples than conventional models, (2) perform consistently under certain types of covariate shift, and (3) learn representations that are consistent with symbolic knowledge, and are thus naturally interpretable. In this section we present an empirical evaluation of SMTLayer on four learning problems that can be decomposed into perceptual and symbolic subtasks. |
| Researcher Affiliation | Collaboration | Zifan Wang (Center for AI Safety), Saranya Vijayakumar (Carnegie Mellon University), Kaiji Lu (Pinterest Inc.), Vijay Ganesh (Georgia Institute of Technology), Somesh Jha (University of Wisconsin-Madison), Matt Fredrikson (Carnegie Mellon University) |
| Pseudocode | Yes | Algorithm 1: F^φ_smt(z) forward pass and Algorithm 2: B^φ_core(z, y, ∂ℓ_y(y, y*)) unsat-core-based backward pass (a hedged sketch of this forward/backward scheme appears below the table) |
| Open Source Code | Yes | Our code is available at https://github.com/cmu-transparency/smt-layer |
| Open Datasets | Yes | MNIST Addition. The MNIST addition problem is illustrated in Figure 1 and is similar to the benchmark described by [13]. For training, we use MNIST +p%... Visual Algebra. The task is to solve for the variable x in a graphical depiction of the equation ax + b = c, where a, b, and c are randomly chosen numbers and each symbol is depicted visually using EMNIST [5] and HASY [29] graphics. Visual Sudoku. The task is to complete a 9 × 9 Sudoku board in which each entry is an MNIST digit. We use the dataset from the SATNet evaluation [35]. (A sketch of an MNIST-addition-style dataset appears below the table.) |
| Dataset Splits | No | No explicit details on validation dataset splits (e.g., percentages, sample counts, or specific methodology for creating a validation set) are provided. The paper discusses training and test sets but not validation. |
| Hardware Specification | No | The paper notes that the SMT solver must run on CPU cores and that some functionality could potentially be moved to GPU cores, but it does not specify the hardware used in the experiments (CPU/GPU models, memory, etc.). |
| Software Dependencies | No | We implemented a prototype of our approach using PyTorch [22] and Z3 [7]. (No version numbers are provided for PyTorch or Z3, the two key software dependencies; a version-recording snippet appears below the table.) |
| Experiment Setup | Yes | Hyper-parameters used for training are in Appendix A.5. When training models with SMTLayer, we use SGD with Nesterov momentum at rate 0.9 and a gradient clipping rate of 0.1. Before training a model with SMTLayer (or a comparison technique, unless stated otherwise), we first pre-train the neural network by replacing SMTLayer with a dense network containing one hidden layer of 512 neurons. (An illustrative PyTorch setup appears below the table.) |
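
The pseudocode row names a solver-based forward pass (Algorithm 1) and an unsat-core-based backward pass (Algorithm 2). The following is a minimal, hedged sketch of that scheme as a PyTorch `autograd.Function` wrapping Z3; it is not the authors' implementation. The XOR theory, the sign-based binarization, the one-step output heuristic, and the ±1 gradient encoding are all assumptions made for the example.

```python
import torch
import z3

# Symbolic knowledge phi for the toy layer: the output bit is the XOR of
# the two input bits. (This theory is an assumption made for the sketch.)
Z0, Z1, Y = z3.Bools("z0 z1 y")
THEORY = Y == z3.Xor(Z0, Z1)
IN_VARS, OUT_VARS = [Z0, Z1], [Y]

def _lits(vars_, bits):
    # Encode concrete Boolean values as asserted literals v / Not(v).
    return [v if b else z3.Not(v) for v, b in zip(vars_, bits)]

class SMTLayerFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, z):
        # Binarize the incoming logits and solve phi under those literals.
        bits = [v > 0 for v in z.tolist()]
        s = z3.Solver()
        s.add(THEORY, *_lits(IN_VARS, bits))
        assert s.check() == z3.sat
        m = s.model()
        y = torch.tensor([1.0 if z3.is_true(m[v]) else -1.0 for v in OUT_VARS])
        ctx.save_for_backward(z, y)
        return y

    @staticmethod
    def backward(ctx, grad_y):
        z, y = ctx.saved_tensors
        bits = [v > 0 for v in z.tolist()]
        # Heuristic: take one unit step on the output against the loss
        # gradient, then re-binarize to get the desired output bits.
        want = [(yv - g) > 0 for yv, g in zip(y.tolist(), grad_y.tolist())]
        s = z3.Solver()
        s.set(unsat_core=True)
        s.add(THEORY, *_lits(OUT_VARS, want))
        tags = []
        for i, lit in enumerate(_lits(IN_VARS, bits)):
            tag = z3.Bool(f"in{i}")
            s.assert_and_track(lit, tag)
            tags.append(tag)
        grad_z = torch.zeros_like(z)
        if s.check() == z3.unsat:
            # Inputs named in the unsat core conflict with the desired
            # output; push each of them toward its flipped value.
            core = {str(c) for c in s.unsat_core()}
            for i, tag in enumerate(tags):
                if str(tag) in core:
                    grad_z[i] = 1.0 if bits[i] else -1.0
        return grad_z
```

A short usage example under the same assumptions: the loss below pretends the label wanted the output to be −1, so the backward pass finds the input literals that conflict with that target and pushes them toward flipping.

```python
z = torch.tensor([1.5, -0.7], requires_grad=True)
y = SMTLayerFn.apply(z)   # True XOR False -> tensor([1.])
loss = (1.0 + y).sum()    # penalize y = +1, i.e. prefer y = -1
loss.backward()           # z.grad is nonzero only for inputs in the core
```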
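The datasets row describes an MNIST-addition benchmark. Below is a hedged sketch of how such a dataset can be assembled with torchvision; the deterministic pairing strategy and the class layout are assumptions for illustration, not the authors' exact pipeline (which also involves the MNIST +p% shift variants).

```python
import torch
from torch.utils.data import Dataset
from torchvision import datasets, transforms

class MNISTAddition(Dataset):
    """Pairs of MNIST digits labeled with the sum of the two digits."""

    def __init__(self, root="data", train=True):
        self.mnist = datasets.MNIST(root, train=train, download=True,
                                    transform=transforms.ToTensor())
        # Deterministic pairing for the sketch: image 2i with image 2i+1.
        self.pairs = [(2 * i, 2 * i + 1) for i in range(len(self.mnist) // 2)]

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        i, j = self.pairs[idx]
        (x1, d1), (x2, d2) = self.mnist[i], self.mnist[j]
        # Input: both images stacked; label: the sum, in 0..18.
        return torch.stack([x1, x2]), d1 + d2
```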
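Since the paper pins no versions for its two stated dependencies, a reproduction should at least record the ones it ran with. A minimal snippet using only documented APIs:

```python
import torch
import z3

# Record the environment that a reproduction actually ran with.
print("torch:", torch.__version__)
print("z3-solver:", z3.get_version_string())
```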
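The experiment-setup row fixes a few concrete choices. Below is a hedged PyTorch sketch of that setup. Only the Nesterov momentum of 0.9, the clipping value 0.1, and the 512-unit hidden layer come from the paper; the learning rate, the backbone and output sizes, and the reading of "gradient clipping rate 0.1" as a global-norm bound are assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical backbone; sizes are illustrative, not from the paper.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())
# Pre-training surrogate standing in for SMTLayer: one hidden layer of 512.
surrogate_head = nn.Sequential(nn.Linear(128, 512), nn.ReLU(),
                               nn.Linear(512, 10))
model = nn.Sequential(backbone, surrogate_head)

# SGD with Nesterov momentum at rate 0.9, as stated; lr is assumed.
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9,
                      nesterov=True)

def step(x, y):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # "Gradient clipping rate 0.1", read here as a max gradient norm.
    torch.nn.utils.clip_grad_norm_(model.parameters(), 0.1)
    opt.step()
    return loss.item()
```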