Evaluating Robustness of Neural Networks with Mixed Integer Programming
Authors: Vincent Tjeng, Kai Y. Xiao, Russ Tedrake
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a representative task of finding minimum adversarial distortions, our verifier is two to three orders of magnitude quicker than the state-of-the-art. We achieve this computational speedup via tight formulations for non-linearities, as well as a novel presolve algorithm that makes full use of all information available. The computational speedup allows us to verify properties on convolutional and residual networks with over 100,000 ReLUs several orders of magnitude more than networks previously verified by any complete verifier. In particular, we determine for the first time the exact adversarial accuracy of an MNIST classifier to perturbations with bounded l∞ norm ϵ = 0.1. Across all robust training procedures and network architectures considered, and for both the MNIST and CIFAR-10 datasets, we are able to certify more samples than the state-of-the-art and find more adversarial examples than a strong first-order attack. |
| Researcher Affiliation | Academia | Vincent Tjeng, Kai Xiao, Russ Tedrake Massachusetts Institute of Technology {vtjeng, kaix, russt}@mit.edu |
| Pseudocode | Yes | Pseudocode demonstrating how to efficiently determine bounds for the tightest possible formulations for the ReLU and maximum function is provided below and in Appendix C respectively. |
| Open Source Code | Yes | Our code is available at https://github.com/vtjeng/MIPVerify.jl. |
| Open Datasets | Yes | All experiments are carried out on classifiers for the MNIST dataset of handwritten digits or the CIFAR-10 dataset of color images. |
| Dataset Splits | No | The paper mentions 'test set' and 'test error' and discusses 'training methods' for networks, but it does not provide explicit details about how the datasets were split into training, validation, and test sets (e.g., percentages or counts for each split). |
| Hardware Specification | Yes | All experiments were run on a KVM virtual machine with 8 virtual CPUs running on shared hardware, with Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz processors, and 8GB of RAM. |
| Software Dependencies | Yes | We construct the MILP models in Julia (Bezanson et al., 2017) using JuMP (Dunning et al., 2017), with the model solved by the commercial solver Gurobi 7.5.2 (Gurobi Optimization, 2017). |
| Experiment Setup | Yes | PGD attacks were carried out with l∞ norm-bound ϵ = 0.1, 8 steps per sample, and a step size of 0.334. An l1 regularization term was added to the objective with a weight of 0.0015625 on the first convolution layer and 0.003125 for the remaining layers. |
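The "tight formulations for non-linearities" that the Pseudocode row refers to correspond to the standard sharp MILP encoding of an unstable ReLU $y = \max(x, 0)$ whose pre-activation is known to lie in $[l, u]$ with $l < 0 < u$; the bounds $l$ and $u$ are what the paper's presolve algorithm tightens. A sketch of that encoding, using a binary phase indicator $z$:

```latex
% ReLU y = max(x, 0) with known bounds l < 0 < u on x:
y \ge 0, \qquad y \ge x, \qquad
y \le u \cdot z, \qquad y \le x - l(1 - z), \qquad z \in \{0, 1\}
% z = 1 forces y = x (active phase); z = 0 forces y = 0 (inactive phase)
```

The formulation is exact at both phases: setting $z = 1$ gives $y \le x$ and $y \ge x$, so $y = x$; setting $z = 0$ gives $y \le 0$ and $y \ge 0$, so $y = 0$. Its tightness in practice depends on how close $l$ and $u$ are to the true reachable range.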
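The PGD hyperparameters quoted in the Experiment Setup row (ϵ = 0.1, 8 steps, step size 0.334) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy `grad_fn`, the pixel range [0, 1], and the reading of 0.334 as a fraction of ϵ are assumptions made for illustration.

```python
import numpy as np

def pgd_attack(grad_fn, x0, eps=0.1, steps=8, step_size=0.334):
    """Projected gradient descent within an l-infinity ball of radius eps.

    grad_fn(x) returns the gradient of the loss w.r.t. the input x.
    Each iterate takes a signed-gradient ascent step, is projected back
    onto the eps-ball around x0, and is clipped to the valid pixel range.
    Assumption: step_size is interpreted as a fraction of eps.
    """
    x = x0.copy()
    for _ in range(steps):
        x = x + step_size * eps * np.sign(grad_fn(x))
        x = np.clip(x, x0 - eps, x0 + eps)  # project onto the l-inf ball
        x = np.clip(x, 0.0, 1.0)            # keep pixels in [0, 1]
    return x

# Toy example: a linear loss w . x, whose gradient is the constant w.
w = np.array([1.0, -1.0, 0.5])
x0 = np.array([0.5, 0.5, 0.5])
x_adv = pgd_attack(lambda x: w, x0)
# The cumulative step (8 * 0.334 * eps) exceeds eps, so every coordinate
# saturates at the boundary of the ball: x_adv = [0.6, 0.4, 0.6].
```

Because the attack saturates the ϵ-ball here, the result coincides with a single FGSM step; on a real network the gradient changes between iterates, which is where the 8 intermediate projections matter.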