Evaluating Robustness of Neural Networks with Mixed Integer Programming

Authors: Vincent Tjeng, Kai Y. Xiao, Russ Tedrake

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On a representative task of finding minimum adversarial distortions, our verifier is two to three orders of magnitude quicker than the state-of-the-art. We achieve this computational speedup via tight formulations for non-linearities, as well as a novel presolve algorithm that makes full use of all information available. The computational speedup allows us to verify properties on convolutional and residual networks with over 100,000 ReLUs, several orders of magnitude more than networks previously verified by any complete verifier. In particular, we determine for the first time the exact adversarial accuracy of an MNIST classifier to perturbations with bounded l∞ norm ϵ = 0.1. Across all robust training procedures and network architectures considered, and for both the MNIST and CIFAR-10 datasets, we are able to certify more samples than the state-of-the-art and find more adversarial examples than a strong first-order attack.
Researcher Affiliation | Academia | Vincent Tjeng, Kai Xiao, Russ Tedrake, Massachusetts Institute of Technology, {vtjeng, kaix, russt}@mit.edu
Pseudocode | Yes | Pseudocode demonstrating how to efficiently determine bounds for the tightest possible formulations of the ReLU and maximum functions is provided below and in Appendix C, respectively.
Open Source Code | Yes | Our code is available at https://github.com/vtjeng/MIPVerify.jl.
Open Datasets | Yes | All experiments are carried out on classifiers for the MNIST dataset of handwritten digits or the CIFAR-10 dataset of color images.
Dataset Splits | No | The paper mentions 'test set' and 'test error' and discusses 'training methods' for networks, but it does not provide explicit details about how the datasets were split into training, validation, and test sets (e.g., percentages or counts for each split).
Hardware Specification | Yes | All experiments were run on a KVM virtual machine with 8 virtual CPUs running on shared hardware, with Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz processors, and 8GB of RAM.
Software Dependencies | Yes | We construct the MILP models in Julia (Bezanson et al., 2017) using JuMP (Dunning et al., 2017), with the model solved by the commercial solver Gurobi 7.5.2 (Gurobi Optimization, 2017).
Experiment Setup | Yes | PGD attacks were carried out with l∞ norm-bound ϵ = 0.1, 8 steps per sample, and a step size of 0.334. An l1 regularization term was added to the objective with a weight of 0.0015625 on the first convolution layer and 0.003125 for the remaining layers.
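
The speedup quoted in the "Research Type" row rests on tight mixed-integer formulations of the network's non-linearities. As a point of reference, and not a verbatim excerpt from the paper, the standard tight encoding used by MILP verifiers of this kind for a single ReLU $y = \max(x, 0)$ with known pre-activation bounds $l \le x \le u$ and $l < 0 < u$ is:

```latex
y \ge 0, \qquad y \ge x, \qquad y \le u \cdot a, \qquad y \le x - l\,(1 - a), \qquad a \in \{0, 1\}
```

Here $a = 1$ forces $y = x$ (active unit) and $a = 0$ forces $y = 0$ (inactive unit). If the bounds already show $u \le 0$ or $l \ge 0$, the unit is provably inactive or active and no binary variable is needed at all, which is why the bound-determining presolve mentioned in the abstract translates directly into smaller and faster MILPs.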
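The "Pseudocode" row concerns procedures for computing those pre-activation bounds. A minimal sketch of the coarsest such procedure, interval arithmetic pushed through one affine layer followed by a ReLU, is given below; the function names and signatures are illustrative rather than taken from MIPVerify.jl, and the paper's progressive bounds determination additionally refines such bounds with LP and MIP relaxations.

```julia
# Interval-arithmetic bound propagation through an affine layer y = W*x + b,
# given elementwise bounds l .<= x .<= u on the input.
# Illustrative sketch only; this is not the MIPVerify.jl implementation.
function affine_bounds(W::AbstractMatrix, b::AbstractVector,
                       l::AbstractVector, u::AbstractVector)
    Wpos = max.(W, 0.0)                  # positive part of the weights
    Wneg = min.(W, 0.0)                  # negative part of the weights
    lower = Wpos * l .+ Wneg * u .+ b    # smallest achievable output
    upper = Wpos * u .+ Wneg * l .+ b    # largest achievable output
    return lower, upper
end

# A ReLU simply clips both bounds from below at zero.
relu_bounds(l, u) = (max.(l, 0.0), max.(u, 0.0))
```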
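The "Software Dependencies" row states that the MILP models are built with JuMP and solved with Gurobi 7.5.2. The snippet below is a minimal sketch of encoding one ReLU with the constraints shown above; it uses current JuMP syntax and the open-source HiGHS solver in place of Gurobi, so it illustrates the modelling pattern rather than reproducing the paper's 2017-era code.

```julia
using JuMP, HiGHS   # HiGHS stands in for the commercial Gurobi solver used in the paper

l, u = -1.0, 2.0                      # assumed pre-activation bounds, l < 0 < u

model = Model(HiGHS.Optimizer)
@variable(model, l <= x <= u)         # pre-activation value
@variable(model, y >= 0)              # post-activation value, y = max(x, 0)
@variable(model, a, Bin)              # a = 1 iff the ReLU is active

@constraint(model, y >= x)
@constraint(model, y <= u * a)
@constraint(model, y <= x - l * (1 - a))

# Tiny sanity check: force the unit active and minimize its output.
@constraint(model, x >= 0.5)
@objective(model, Min, y)
optimize!(model)
value(y)   # 0.5, since y is pinned to x once the ReLU must be active
```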
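The "Experiment Setup" row quotes the parameters of the first-order PGD attack used as a baseline (ϵ = 0.1, 8 steps, step size 0.334). Below is a hypothetical sketch of such an l∞ PGD attack written with Flux; the model, the loss choice, and the [0, 1] pixel range are assumptions rather than details confirmed by the paper.

```julia
using Flux

# Projected gradient descent in the l∞ ball of radius ϵ around a clean input x.
# Parameter defaults follow the quoted setup; the projection step keeps the
# iterates inside the ϵ-ball regardless of how the step size is interpreted.
function pgd_attack(model, x, y; ϵ = 0.1, steps = 8, α = 0.334)
    x_adv = copy(x)
    for _ in 1:steps
        # Gradient of the classification loss with respect to the candidate input.
        g = gradient(xa -> Flux.Losses.logitcrossentropy(model(xa), y), x_adv)[1]
        x_adv = x_adv .+ α .* sign.(g)           # ascent step on the loss
        x_adv = clamp.(x_adv, x .- ϵ, x .+ ϵ)    # project back into the l∞ ball
        x_adv = clamp.(x_adv, 0.0f0, 1.0f0)      # keep pixels in the valid range
    end
    return x_adv
end
```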