Fast and Effective Robustness Certification

Authors: Gagandeep Singh, Timon Gehr, Matthew Mirman, Markus Püschel, Martin Vechev

NeurIPS 2018

Reproducibility assessment — each entry lists the variable, the assessed result, and the supporting LLM response:
Research Type: Experimental. "We now evaluate the effectiveness of our new Zonotope transformers for verifying local robustness of neural networks. Our implementation is available as an end-to-end automated verifier, called DeepZ." ... "We used the popular MNIST [18] and CIFAR10 [16] datasets for our experiments." ... "Fig. 3 shows the percentage of verified robustness and the average analysis time of all three certifiers."
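The Zonotope transformers the paper evaluates propagate sets of the form x = c + G·ε, ε ∈ [−1, 1]^k, through the network layer by layer. Below is a minimal NumPy sketch of this domain with the DeepZ-style ReLU relaxation (slope λ = u/(u−l), offset μ = −λl/2, plus one fresh noise symbol per unstable neuron). The class and method names are my own illustration, not the authors' ELINA implementation:

```python
import numpy as np

class Zonotope:
    """Set of points x = c + G @ eps, with each eps_i in [-1, 1]."""
    def __init__(self, center, generators):
        self.c = np.asarray(center, dtype=float)      # shape (n,)
        self.G = np.asarray(generators, dtype=float)  # shape (n, k)

    def bounds(self):
        # Interval concretization: radius is the L1 norm of each row.
        r = np.abs(self.G).sum(axis=1)
        return self.c - r, self.c + r

    def affine(self, W, b):
        # Affine maps are exact on zonotopes: W(c + G eps) + b.
        return Zonotope(W @ self.c + b, W @ self.G)

    def relu(self):
        lb, ub = self.bounds()
        n, _ = self.G.shape
        c, G = self.c.copy(), self.G.copy()
        new_gens = []
        for i in range(n):
            if ub[i] <= 0:       # provably inactive: output is exactly 0
                c[i] = 0.0
                G[i, :] = 0.0
            elif lb[i] >= 0:     # provably active: identity, no loss
                pass
            else:                # unstable: sound linear over-approximation
                lam = ub[i] / (ub[i] - lb[i])
                mu = -lam * lb[i] / 2.0  # half the max approximation gap
                c[i] = lam * c[i] + mu
                G[i, :] *= lam
                g = np.zeros(n)
                g[i] = mu        # fresh noise symbol for this neuron
                new_gens.append(g)
        if new_gens:
            G = np.hstack([G, np.stack(new_gens, axis=1)])
        return Zonotope(c, G)
```

Starting from an L∞ ball (center image, generators ε·I), alternating `affine` and `relu` yields output bounds from which robustness can be checked.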
Researcher Affiliation: Academia. Gagandeep Singh, Timon Gehr, Matthew Mirman, Markus Püschel, Martin Vechev, Department of Computer Science, ETH Zurich, Switzerland. {gsingh,timon.gehr,matthew.mirman,pueschel,martin.vechev}@inf.ethz.ch
Pseudocode: No. The paper gives mathematical formulas and prose descriptions of its methods but includes no structured pseudocode or algorithm blocks.
Open Source Code: Yes. "All of our code, datasets and results are publicly available at http://safeai.ethz.ch/."
Open Datasets: Yes. "We used the popular MNIST [18] and CIFAR10 [16] datasets for our experiments."
Dataset Splits: No. The paper mentions using "the popular MNIST [18] and CIFAR10 [16] datasets" and selecting "the first 100 images from the test set of each data set" as benchmarks. Although standard datasets are used, the main text gives no explicit train/validation/test splits (percentages or exact counts per split); only a specific subset of the test set is identified, and only for benchmarking.
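The benchmark protocol quoted above (certify the first 100 test images at a given ϵ) reduces to a short loop. In this sketch, `verify_fn` is a hypothetical stand-in for a certifier such as DeepZ, not the paper's actual interface:

```python
def verified_fraction(verify_fn, images, labels, eps, n=100):
    """Fraction of the first n test inputs certified eps-robust.

    verify_fn(image, label, eps) -> bool is a hypothetical stand-in
    for a robustness certifier; it is NOT the paper's API.
    """
    batch = list(zip(images[:n], labels[:n]))
    hits = sum(bool(verify_fn(x, y, eps)) for x, y in batch)
    return hits / len(batch)
```

This is the quantity plotted as "percentage of verified robustness" in the paper's Fig. 3.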
Hardware Specification: Yes. "All experiments for the FFNNs were carried out on a 3.3 GHz 10 core Intel i9-7900X Skylake CPU with 64 GB main memory; the CNNs and the residual network were evaluated on a 2.6 GHz 14 core Intel Xeon CPU E5-2690 with 512 GB main memory."
Software Dependencies: No. The paper states that the verifier is "implemented in Python", that the abstract transformers are written in "C", and that they are part of the "ELINA [1, 25]" library, but it gives no version numbers for Python, the C compiler/runtime, or ELINA.
Experiment Setup: Yes. "For adversarial training, we used DiffAI [21] and projected gradient descent (PGD) [6] parameterized with ϵ. In our evaluation, we refer to the undefended nets as Point, and to the defended networks with the name of the training procedure (either DiffAI or PGD). More details on our neural networks and the training procedures can be found in the appendix." ... "(PGD variant which we refer to as PGD in the graphs) [6] with µ = 1, 22 iterations and two restarts, where the step size is ϵ/5.5 for the ϵ used for training".
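The PGD configuration quoted above (an L∞ ball of radius ϵ, a fixed step size, 22 iterations, two random restarts) can be sketched generically. `pgd_linf` and its interface are illustrative, not the authors' training code; the gradient is supplied as an analytic callable so the sketch stays self-contained:

```python
import numpy as np

def pgd_linf(x0, grad_fn, loss_fn, eps, step, iters, restarts, rng=None):
    """Projected gradient ascent on loss_fn inside the L-infinity ball
    of radius eps around x0, keeping the best of several random restarts."""
    rng = np.random.default_rng(rng)
    best_x, best_loss = x0.copy(), loss_fn(x0)
    for _ in range(restarts):
        # Random start inside the ball.
        x = x0 + rng.uniform(-eps, eps, size=x0.shape)
        for _ in range(iters):
            x = x + step * np.sign(grad_fn(x))  # signed-gradient step
            x = np.clip(x, x0 - eps, x0 + eps)  # project back onto the ball
        if loss_fn(x) > best_loss:
            best_x, best_loss = x, loss_fn(x)
    return best_x
```

In adversarial training, each minibatch example would be replaced by `pgd_linf(...)` applied to the network's loss before the parameter update.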