DL2: Training and Querying Neural Networks with Logic
Authors: Marc Fischer, Mislav Balunovic, Dana Drachsler-Cohen, Timon Gehr, Ce Zhang, Martin Vechev
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We present DL2, a system for training and querying neural networks with logical constraints. Using DL2, one can declaratively specify domain knowledge constraints to be enforced during training, as well as pose queries on the model to find inputs that satisfy a set of constraints. DL2 works by translating logical constraints into a loss function with desirable mathematical properties. The loss is then minimized with standard gradient-based methods. We evaluate DL2 by training networks with interesting constraints in unsupervised, semi-supervised and supervised settings. Our experimental evaluation demonstrates that DL2 is more expressive than prior approaches combining logic and neural networks, and its loss functions are better suited for optimization. Further, we show that for a number of queries, DL2 can find the desired inputs in seconds (even for large models such as ResNet-50 on ImageNet)." and "6. Experimental Evaluation We now present a thorough experimental evaluation showing the effectiveness of DL2 for querying and training neural networks with logical constraints." (A sketch of the constraint-to-loss translation appears below the table.) |
| Researcher Affiliation | Academia | Marc Fischer¹, Mislav Balunović¹, Dana Drachsler-Cohen¹, Timon Gehr¹, Ce Zhang¹, Martin Vechev¹. ¹Department of Computer Science, ETH Zurich, Switzerland. Correspondence to: Marc Fischer <marc.fischer@inf.ethz.ch>. |
| Pseudocode | Yes | Algorithm 1 Training with constraints. Input: training set T, network parameters θ, constraint ϕ(x, z, θ). Extract convex set B and constraint ψ from constraint ϕ. for epoch = 1 to n_epochs do: sample mini-batch of vectors x ⊆ T; using PGD, compute z* ← arg min_{z ∈ B} L(ψ)(x, z, θ) for each x; perform GD update with ∇θ L(ϕ)(x, z*, θ); end for. (A runnable sketch of this loop follows the table.) |
| Open Source Code | Yes | https://github.com/eth-sri/dl2 |
| Open Datasets | Yes | We evaluated DL2 on various tasks (supervised, semi-supervised and unsupervised learning) across four datasets: MNIST, FASHION (Xiao et al., 2017), CIFAR-10, and CIFAR-100 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | Yes | "For semi-supervised learning, we focus on the CIFAR-100 dataset, and split the training set into labeled, unlabeled and validation set in the ratio 20/60/20." and "We generate random connected graphs with 15 vertices and split them into training (300), validation (150) and test (150) sets." |
| Hardware Specification | Yes | Our system is implemented in PyTorch (Paszke et al., 2017) and evaluated on an Nvidia GTX 1080 Ti and Intel Core i7-7700K with 4.20 GHz. |
| Software Dependencies | No | Our system is implemented in PyTorch (Paszke et al., 2017) and evaluated on an Nvidia GTX 1080 Ti and Intel Core i7-7700K with 4.20 GHz. The paper mentions PyTorch but does not specify a version number. |
| Experiment Setup | No | "We train a network with the normal cross-entropy loss on the labeled data and the DL2 encoding of the constraint on the unlabeled data (weighted by a parameter λ). For further implementation details, see Appendix A." ... "We optimize the loss using Adam (Kingma & Ba, 2015)." ... "Unlike training, we perform optimization with L-BFGS-B... If a solution is not found after a certain time, we restart L-BFGS-B and reinitialize the variables using MCMC sampling." The paper mentions the optimizers and general training aspects (such as the constraint loss weighted by λ) but does not provide specific hyperparameter values such as learning rate, batch size, or concrete values for λ or the number of epochs in the main text. (A sketch of the L-BFGS-B querying step follows the table.) |
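
The abstract quoted in the Research Type row hinges on translating logical constraints into a differentiable loss. Below is a minimal sketch of that translation, assuming PyTorch and following the paper's rules (comparisons become hinge-style terms, conjunction becomes a sum, disjunction a product); the combinator names and the example constraint are our own, not the authors' implementation.

```python
import torch

# DL2-style translation: a constraint becomes a non-negative loss that is
# zero exactly when the constraint is satisfied.

def le(t, t_prime):
    # L(t <= t') = max(t - t', 0)
    return torch.clamp(t - t_prime, min=0.0)

def eq(t, t_prime):
    # L(t = t') encoded as (t <= t') and (t' <= t), i.e. |t - t'|
    return le(t, t_prime) + le(t_prime, t)

def and_(l_phi, l_psi):
    # L(phi AND psi) = L(phi) + L(psi)
    return l_phi + l_psi

def or_(l_phi, l_psi):
    # L(phi OR psi) = L(phi) * L(psi)
    return l_phi * l_psi

# Hypothetical example constraint on output probabilities p:
# "p[0] >= 0.8  OR  p[1] + p[2] <= 0.1".
logits = torch.randn(3, requires_grad=True)
p = torch.softmax(logits, dim=0)
loss = or_(le(torch.tensor(0.8), p[0]), le(p[1] + p[2], torch.tensor(0.1)))
loss.backward()  # gradients reach the logits, so the loss can be minimized
```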
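
Algorithm 1 (Pseudocode row) alternates an inner PGD search for a worst-case z* inside the convex set B with an outer gradient step on the constraint loss. The loop below is our own simplified sketch, not the released implementation: `loss_psi`, `loss_phi`, `init_z`, and `project_B` are hypothetical callables, and the PGD step count, step size, and use of Adam (mentioned in the Experiment Setup row) are illustrative choices.

```python
import torch

def pgd_minimize(loss_fn, z_init, project_B, steps=7, step_size=0.05):
    # Inner PGD loop: minimize loss_fn over z, projecting back into B each step.
    z = z_init.clone().detach().requires_grad_(True)
    for _ in range(steps):
        grad, = torch.autograd.grad(loss_fn(z).sum(), z)
        z = project_B(z.detach() - step_size * grad).requires_grad_(True)
    return z.detach()

def train_with_constraints(model, loader, loss_psi, loss_phi, init_z, project_B,
                           epochs=10, lr=1e-3):
    # Sketch of Algorithm 1: find a worst-case z* in B by minimizing the loss of
    # the extracted constraint psi over z, then update theta on L(phi)(x, z*, theta).
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in loader:
            z_star = pgd_minimize(lambda z: loss_psi(model, x, z),
                                  init_z(x), project_B)
            opt.zero_grad()
            loss_phi(model, x, z_star).mean().backward()
            opt.step()
    return model
```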
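
The Experiment Setup row notes that queries are solved with L-BFGS-B and restarted with MCMC-based re-initialization when no solution is found. The sketch below illustrates that querying step under our own assumptions, with SciPy's L-BFGS-B driven by PyTorch gradients; `loss_fn`, the tolerance, the bounds handling, and the naive uniform re-initialization are illustrative choices, not the paper's exact procedure.

```python
import numpy as np
import torch
from scipy.optimize import minimize

def query(model, loss_fn, x0, bounds, max_restarts=5):
    """Search for an input whose DL2 loss is (near) zero, i.e. one that
    satisfies the constraint. `loss_fn(model, x)` is a hypothetical
    translated constraint loss over a single input tensor x."""
    shape = x0.shape

    def fun_and_grad(x_flat):
        # L-BFGS-B works on flat float64 vectors; PyTorch supplies the gradient.
        x = torch.tensor(x_flat.reshape(shape), dtype=torch.float32,
                         requires_grad=True)
        loss = loss_fn(model, x)
        loss.backward()
        return loss.item(), x.grad.numpy().ravel().astype(np.float64)

    for _ in range(max_restarts):
        res = minimize(fun_and_grad, x0.ravel(), jac=True,
                       method='L-BFGS-B', bounds=bounds)
        if res.fun < 1e-6:  # loss ~ 0 means the constraint is satisfied
            return res.x.reshape(shape)
        x0 = np.random.uniform(0.0, 1.0, size=shape)  # crude re-initialization
    return None
```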