DL2: Training and Querying Neural Networks with Logic

Authors: Marc Fischer, Mislav Balunovic, Dana Drachsler-Cohen, Timon Gehr, Ce Zhang, Martin Vechev

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present DL2, a system for training and querying neural networks with logical constraints. Using DL2, one can declaratively specify domain knowledge constraints to be enforced during training, as well as pose queries on the model to find inputs that satisfy a set of constraints. DL2 works by translating logical constraints into a loss function with desirable mathematical properties. The loss is then minimized with standard gradient-based methods. We evaluate DL2 by training networks with interesting constraints in unsupervised, semi-supervised and supervised settings. Our experimental evaluation demonstrates that DL2 is more expressive than prior approaches combining logic and neural networks, and its loss functions are better suited for optimization. Further, we show that for a number of queries, DL2 can find the desired inputs in seconds (even for large models such as ResNet-50 on ImageNet)." and "6. Experimental Evaluation: We now present a thorough experimental evaluation showing the effectiveness of DL2 for querying and training neural networks with logical constraints." (A sketch of this constraint-to-loss translation appears after the table.)
Researcher Affiliation | Academia | Marc Fischer 1, Mislav Balunović 1, Dana Drachsler-Cohen 1, Timon Gehr 1, Ce Zhang 1, Martin Vechev 1; 1 Department of Computer Science, ETH Zurich, Switzerland. Correspondence to: Marc Fischer <marc.fischer@inf.ethz.ch>.
Pseudocode | Yes | Algorithm 1 (Training with constraints). Input: training set T, network parameters θ, constraint ϕ(x, z, θ). Extract convex set B and constraint ψ from constraint ϕ. for epoch = 1 to n_epochs do: sample a mini-batch of vectors x ⊆ T; using PGD, compute z* ← arg min_{z ∈ B} L(ψ)(x, z, θ) for each x; perform a GD update with ∇_θ L(ϕ)(x, z*, θ). end for. (A runnable sketch of this loop appears after the table.)
Open Source Code | Yes | https://github.com/eth-sri/dl2
Open Datasets | Yes | We evaluated DL2 on various tasks (supervised, semi-supervised and unsupervised learning) across four datasets: MNIST, FASHION (Xiao et al., 2017), CIFAR-10, and CIFAR-100 (Krizhevsky & Hinton, 2009).
Dataset Splits | Yes | "For semi-supervised learning, we focus on the CIFAR-100 dataset, and split the training set into labeled, unlabeled and validation set in the ratio 20/60/20." and "We generate random connected graphs with 15 vertices and split them into training (300), validation (150) and test (150) sets." (One possible realization of the 20/60/20 split is sketched after the table.)
Hardware Specification | Yes | Our system is implemented in PyTorch (Paszke et al., 2017) and evaluated on an Nvidia GTX 1080 Ti and Intel Core i7-7700K with 4.20 GHz.
Software Dependencies | No | Our system is implemented in PyTorch (Paszke et al., 2017) and evaluated on an Nvidia GTX 1080 Ti and Intel Core i7-7700K with 4.20 GHz. The paper mentions PyTorch but does not specify a version number.
Experiment Setup | No | We train a network with the normal cross-entropy loss on the labeled data and the DL2 encoding of the constraint on the unlabeled data (weighted by a parameter λ). For further implementation details, see Appendix A. ... We optimize the loss using Adam (Kingma & Ba, 2015). ... Unlike training, we perform optimization with L-BFGS-B... If a solution is not found after a certain time, we restart L-BFGS-B and reinitialize the variables using MCMC sampling. The paper mentions optimizers and general training aspects (such as the λ-weighted loss) but does not provide specific hyperparameter values like the learning rate, batch size, or concrete values for λ or the number of epochs in the main text. (A sketch of such a training step appears after the table.)
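
The abstract quoted in the Research Type row describes DL2's core mechanism: translating logical constraints into a loss minimized with gradient-based methods. The sketch below illustrates that idea with a few DL2-style translation rules (a non-negative hinge term for ≤, a sum for conjunction, a product for disjunction); the function names and the example constraint are ours, not taken from the released code.

```python
import torch

# Illustrative DL2-style translation of constraints into losses.
# The intent, as stated in the paper, is that the loss is zero exactly
# when the constraint is satisfied.

def leq(t1, t2):
    # L(t1 <= t2) = max(t1 - t2, 0)
    return torch.clamp(t1 - t2, min=0.0)

def conj(l1, l2):
    # L(phi and psi) = L(phi) + L(psi)
    return l1 + l2

def disj(l1, l2):
    # L(phi or psi) = L(phi) * L(psi)
    return l1 * l2

# Example constraint on a class probability p: (p <= 0.3) or (p >= 0.7).
logits = torch.randn(8, 10, requires_grad=True)
p = torch.softmax(logits, dim=1)[:, 3]
loss = disj(leq(p, torch.tensor(0.3)), leq(torch.tensor(0.7), p)).mean()
loss.backward()  # differentiable almost everywhere, hence usable with SGD/Adam
```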
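A minimal sketch of the training loop in Algorithm 1 follows, assuming PyTorch and placeholder functions `loss_psi` and `loss_phi` (the translated losses for ψ and ϕ) and `project_onto_B` (projection onto the extracted convex set B); the PGD step count, step size, and learning rate are invented for illustration.

```python
import torch

def train_with_constraint(model, loader, loss_psi, loss_phi, project_onto_B,
                          n_epochs=10, pgd_steps=7, pgd_lr=0.1, lr=1e-3):
    # Sketch of Algorithm 1: inner PGD over z, outer gradient step over theta.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(n_epochs):
        for x, _ in loader:                       # sample a mini-batch x from T
            # Initialize z (here from x; the paper leaves initialization open).
            z = x.clone().detach().requires_grad_(True)
            for _ in range(pgd_steps):            # PGD: z* ~ argmin_{z in B} L(psi)(x, z, theta)
                inner = loss_psi(x, z, model).sum()
                grad, = torch.autograd.grad(inner, z)
                with torch.no_grad():
                    z -= pgd_lr * grad
                    z.copy_(project_onto_B(z))    # project back onto the convex set B
            opt.zero_grad()
            loss_phi(x, z.detach(), model).mean().backward()  # GD update with grad_theta L(phi)(x, z*, theta)
            opt.step()
```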
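For the 20/60/20 labeled/unlabeled/validation split of CIFAR-100 quoted in the Dataset Splits row, the paper states the ratio but not the exact splitting procedure; one plausible realization with torchvision is sketched below (the use of `random_split` and the fixed seed are assumptions).

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Hedged sketch: split the CIFAR-100 training set 20/60/20 into
# labeled / unlabeled / validation subsets.
train = datasets.CIFAR100(root="./data", train=True, download=True,
                          transform=transforms.ToTensor())
n = len(train)                              # 50,000 images
n_lab, n_unlab = int(0.2 * n), int(0.6 * n)
n_val = n - n_lab - n_unlab
labeled, unlabeled, val = random_split(
    train, [n_lab, n_unlab, n_val],
    generator=torch.Generator().manual_seed(0))
```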
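Finally, the Experiment Setup row notes that labeled data is trained with cross-entropy while unlabeled data contributes a λ-weighted DL2 constraint loss, optimized with Adam, without reporting concrete hyperparameters. Below is a hedged sketch of one such training step; `dl2_loss`, `lam`, and the learning rate are placeholders rather than values from the paper.

```python
import torch
import torch.nn.functional as F

def semi_supervised_step(model, opt, x_lab, y_lab, x_unlab, dl2_loss, lam=0.1):
    # One step of the setup described above: cross-entropy on labeled data
    # plus a lambda-weighted DL2 constraint loss on unlabeled data.
    opt.zero_grad()
    ce = F.cross_entropy(model(x_lab), y_lab)
    constraint = dl2_loss(model, x_unlab).mean()
    (ce + lam * constraint).backward()
    opt.step()

# opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is an assumption
```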