Learning Better Representations From Less Data For Propositional Satisfiability

Authors: Mohamed Ghanem, Frederik Schmitt, Julian Siber, Bernd Finkbeiner

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "For experiments in this section, we train on 8K unsat formulas in SR(U(10, 40)) and test our models on 10K unseen formulas belonging to the same distribution." (Section 6); "Table 1: Performance of all attention variants on unsat SR(U(10, 40)) test problems." (Table 1); "Table 3 confirms this main hypothesis. In essence, this result points to the fact that learning signals obtained from training on unsat certificates largely enhance the ability of the neural network to extract useful information from the input formula." (Section 7)
Researcher Affiliation | Academia | "CISPA Helmholtz Center for Information Security {mohamed.ghanem,frederik.schmitt,julian.siber,finkbeiner}@cispa.de"
Pseudocode | No | The paper describes methods and architectures but does not include explicit pseudocode blocks or algorithms formatted as such.
Open Source Code | Yes | "The implementation of our framework can be found at https://github.com/Oschart/NeuRes."
Open Datasets | Yes | "For our training and testing data, we adopt the same formula generation method as NeuroSAT [43], namely SR(n) where n is the number of variables in the formula... To generate our teacher certificates comprising resolution proofs and truth assignments, we use the BooleForce solver [8] on the formulas generated on the SR distribution." (A hedged generation sketch appears below the table.)
Dataset Splits | No | The paper specifies training and test data amounts and distributions (e.g., "train on 8K unsat formulas... and test our models on 10K unseen formulas"), but it does not explicitly mention a separate validation split or dataset.
Hardware Specification | Yes | "We train our models with a batch size of 1 and the Adam optimizer [29] for 50 epochs which took about six days on a single NVIDIA A100 GPU."
Software Dependencies | No | The paper mentions the "Adam optimizer [29]" but does not provide specific version numbers for other software dependencies or libraries.
Experiment Setup | Yes | "NeuRes has several hyperparameters that influence network size, depth, and loss weighting. In the experiments we fix the embedding dimension to 128. We train our models with a batch size of 1 and the Adam optimizer [29] for 50 epochs which took about six days on a single NVIDIA A100 GPU. We linearly anneal the learning rate from 5 × 10⁻⁵ to zero over the training episodes. This empirically yields better results than using a constant learning rate. We use a time discounting factor γ = 0.99 for the episodic loss. We apply global-norm gradient clipping with a ratio of 0.5 [38]." (A hedged training-configuration sketch appears below the table.)
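
The Open Datasets row quotes the SR(n) generation method adopted from NeuroSAT [43]: random clauses are appended to a formula until it first becomes unsatisfiable. The following is a minimal sketch of that loop for SR(U(10, 40)). The clause-width sampling and the python-sat (Minisat22) bindings are illustrative assumptions, not the paper's pipeline, which extracts teacher certificates with BooleForce [8].

```python
# Sketch of SR(U(10, 40)) generation as described for NeuroSAT [43]: draw the
# number of variables uniformly from [10, 40], then append random clauses until
# the formula first becomes unsatisfiable.  The clause-width distribution below
# is illustrative (the exact one is given in the NeuroSAT paper), and the
# python-sat / Minisat22 bindings are an assumed stand-in for an incremental
# satisfiability check.
import random
from pysat.solvers import Minisat22  # assumed dependency: python-sat

def sample_clause(n_vars: int) -> list[int]:
    """Sample one random clause as a list of DIMACS-style literals."""
    k = min(n_vars, random.randint(2, 5))          # illustrative clause width only
    variables = random.sample(range(1, n_vars + 1), k)
    return [v if random.random() < 0.5 else -v for v in variables]

def generate_unsat_formula(n_min: int = 10, n_max: int = 40) -> list[list[int]]:
    """Add random clauses until the conjunction becomes unsatisfiable."""
    n_vars = random.randint(n_min, n_max)
    clauses: list[list[int]] = []
    with Minisat22() as solver:
        while True:
            clause = sample_clause(n_vars)
            solver.add_clause(clause)
            clauses.append(clause)
            if not solver.solve():                 # first point of unsatisfiability
                return clauses                     # the unsat instance used for training/testing

if __name__ == "__main__":
    formula = generate_unsat_formula()
    print(f"sampled an unsat formula with {len(formula)} clauses")
```

NeuroSAT additionally derives a satisfiable sibling by negating one literal of the final clause; since the experiments quoted above train and test on unsat formulas, the sketch stops at the unsat instance.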
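
The Experiment Setup row lists concrete hyperparameters (embedding dimension 128, batch size 1, Adam, 50 epochs, learning rate annealed linearly from 5 × 10⁻⁵ to zero, episodic-loss discount γ = 0.99, global-norm clipping at 0.5). The PyTorch sketch below shows one plausible way to wire these together; the model, the per-step losses, and the annealing granularity are placeholders and assumptions, not the authors' implementation.

```python
# Sketch of the reported training configuration: Adam at 5e-5 annealed linearly
# to zero, global-norm gradient clipping at 0.5, and a gamma = 0.99 discounted
# episodic loss.  The model and per-step losses are placeholders; the number of
# annealing steps (8K formulas x 50 epochs at batch size 1) is an assumption.
import torch
from torch import nn

EMBED_DIM = 128     # embedding dimension fixed in the experiments
EPOCHS    = 50      # training epochs
BASE_LR   = 5e-5    # initial learning rate, annealed linearly to zero
CLIP_NORM = 0.5     # global-norm gradient clipping ratio
GAMMA     = 0.99    # time-discount factor for the episodic loss

model = nn.Linear(EMBED_DIM, EMBED_DIM)                 # placeholder for the NeuRes network
optimizer = torch.optim.Adam(model.parameters(), lr=BASE_LR)

total_steps = 8_000 * EPOCHS                            # batch size 1 over 8K training formulas
scheduler = torch.optim.lr_scheduler.LambdaLR(          # linear anneal: lr -> 0 over training
    optimizer, lambda step: max(0.0, 1.0 - step / total_steps))

def episodic_loss(step_losses: list[torch.Tensor]) -> torch.Tensor:
    """Time-discounted sum of per-step losses (gamma = 0.99)."""
    return sum((GAMMA ** t) * loss for t, loss in enumerate(step_losses))

def train_step(step_losses: list[torch.Tensor]) -> None:
    optimizer.zero_grad()
    episodic_loss(step_losses).backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP_NORM)  # global-norm clipping at 0.5
    optimizer.step()
    scheduler.step()

if __name__ == "__main__":
    x = torch.randn(1, EMBED_DIM)
    dummy_losses = [(model(x) ** 2).mean() for _ in range(3)]      # stand-in per-step losses
    train_step(dummy_losses)
```

In the paper, the per-step losses would come from the episodic supervision on certificates; here they are dummies that merely exercise the optimizer, clipping, and scheduling wiring.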