Learning Better Representations From Less Data For Propositional Satisfiability
Authors: Mohamed Ghanem, Frederik Schmitt, Julian Siber, Bernd Finkbeiner
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "For experiments in this section, we train on 8K unsat formulas in SR(U(10, 40)) and test our models on 10K unseen formulas belonging to the same distribution." (Section 6); "Table 1: Performance of all attention variants on unsat SR(U(10, 40)) test problems." (Table 1); "Table 3 confirms this main hypothesis. In essence, this result points to the fact that learning signals obtained from training on unsat certificates largely enhance the ability of the neural network to extract useful information from the input formula." (Section 7) |
| Researcher Affiliation | Academia | CISPA Helmholtz Center for Information Security {mohamed.ghanem,frederik.schmitt,julian.siber,finkbeiner}@cispa.de |
| Pseudocode | No | The paper describes methods and architectures but does not include explicit pseudocode blocks or algorithms formatted as such. |
| Open Source Code | Yes | The implementation of our framework can be found at https://github.com/Oschart/NeuRes. |
| Open Datasets | Yes | For our training and testing data, we adopt the same formula generation method as NeuroSAT [43], namely SR(n) where n is the number of variables in the formula... To generate our teacher certificates comprising resolution proofs and truth assignments, we use the BooleForce solver [8] on the formulas generated on the SR distribution. |
| Dataset Splits | No | The paper specifies training and test data amounts and distributions (e.g., 'train on 8K unsat formulas... and test our models on 10K unseen formulas'), but it does not explicitly mention a separate validation split or dataset. |
| Hardware Specification | Yes | We train our models with a batch size of 1 and the Adam optimizer [29] for 50 epochs which took about six days on a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions 'Adam optimizer [29]' but does not provide specific version numbers for other software dependencies or libraries. |
| Experiment Setup | Yes | NeuRes has several hyperparameters that influence network size, depth, and loss weighting. In the experiments we fix the embedding dimension to 128. We train our models with a batch size of 1 and the Adam optimizer [29] for 50 epochs which took about six days on a single NVIDIA A100 GPU. We linearly anneal the learning rate from 5 × 10⁻⁵ to zero over the training episodes. This empirically yields better results than using a constant learning rate. We use a time discounting factor γ = 0.99 for the episodic loss. We apply global-norm gradient clipping with a ratio of 0.5 [38]. |
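The Open Datasets row cites NeuroSAT's SR(n) generator and the BooleForce solver. The sketch below illustrates how SR(U(10, 40)) formula pairs could be produced with an incremental SAT solver; it is an assumption-heavy reading of the SR(n) scheme, not the authors' code. The clause-size distribution and the use of python-sat's Minisat22 bindings are assumptions, and certificate extraction with BooleForce is not shown.

```python
# Hypothetical sketch of SR(U(10, 40)) pair generation in the spirit of NeuroSAT's SR(n),
# using the python-sat (pysat) Minisat22 bindings for incremental satisfiability checks.
import random
from pysat.solvers import Minisat22


def sample_clause_size():
    # NeuroSAT samples small clause sizes; this 1 + Bernoulli(0.7) + Geometric(0.4)
    # form is an assumption about that scheme, not taken from the NeuRes paper.
    size = 1 + (1 if random.random() < 0.7 else 0)
    while random.random() >= 0.4:
        size += 1
    return size


def sample_clause(n_vars):
    # Pick distinct variables and negate each with probability 1/2.
    k = min(sample_clause_size(), n_vars)
    variables = random.sample(range(1, n_vars + 1), k)
    return [v if random.random() < 0.5 else -v for v in variables]


def generate_sr_pair(n_min=10, n_max=40):
    """Return (unsat_clauses, sat_clauses) over the same variable set."""
    n_vars = random.randint(n_min, n_max)
    clauses = []
    with Minisat22() as solver:
        while True:
            clause = sample_clause(n_vars)
            solver.add_clause(clause)
            clauses.append(clause)
            if not solver.solve():
                # Adding this clause made the formula unsat; flipping one literal
                # of that final clause yields the satisfiable twin.
                unsat = [list(c) for c in clauses]
                flipped = list(clauses[-1])
                flipped[0] = -flipped[0]
                sat = [list(c) for c in clauses[:-1]] + [flipped]
                return unsat, sat
```

For the experiments described above, only the unsatisfiable member of each pair would be kept and passed to a certificate-producing solver such as BooleForce.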
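The Experiment Setup row lists concrete hyperparameters. The following PyTorch skeleton shows one way those settings (batch size 1, Adam, 50 epochs, learning rate linearly annealed from 5 × 10⁻⁵ to zero, global-norm gradient clipping at 0.5, γ = 0.99 for the episodic loss) could be wired together; the model, data, and loss are placeholder stand-ins, not the authors' implementation.

```python
# Minimal training-loop skeleton reflecting the reported hyperparameters.
# Model, data, and loss are toy placeholders for the NeuRes network and
# the SR(U(10, 40)) episodes, which are not reproduced here.
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR

EMBED_DIM = 128   # embedding dimension fixed in the paper's experiments
EPOCHS = 50
LR = 5e-5
CLIP_NORM = 0.5
GAMMA = 0.99      # time-discount factor for the episodic loss (unused by this toy loss)

model = nn.Linear(EMBED_DIM, EMBED_DIM)                  # stand-in for the NeuRes network
train_data = [torch.randn(EMBED_DIM) for _ in range(8)]  # stand-in episodes

optimizer = Adam(model.parameters(), lr=LR)
total_steps = EPOCHS * len(train_data)
# Linearly anneal the learning rate from 5e-5 down to zero over all training steps.
scheduler = LambdaLR(optimizer, lambda step: max(0.0, 1.0 - step / total_steps))

for epoch in range(EPOCHS):
    for episode in train_data:               # batch size 1: one episode per optimizer step
        loss = model(episode).pow(2).mean()  # placeholder for the certificate/episodic loss
        optimizer.zero_grad()
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP_NORM)  # global-norm clipping at 0.5
        optimizer.step()
        scheduler.step()
```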