Fast and Effective Robustness Certification
Authors: Gagandeep Singh, Timon Gehr, Matthew Mirman, Markus Püschel, Martin Vechev
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now evaluate the effectiveness of our new Zonotope transformers for verifying local robustness of neural networks. Our implementation is available as an end-to-end automated verifier, called DeepZ. ... We used the popular MNIST [18] and CIFAR10 [16] datasets for our experiments. ... Fig. 3 shows the percentage of verified robustness and the average analysis time of all three certifiers. (A generic zonotope-propagation sketch is given below the table.) |
| Researcher Affiliation | Academia | Gagandeep Singh, Timon Gehr, Matthew Mirman, Markus Püschel, Martin Vechev Department of Computer Science ETH Zurich, Switzerland {gsingh,timon.gehr,matthew.mirman,pueschel,martin.vechev}@inf.ethz.ch |
| Pseudocode | No | The paper provides mathematical formulas and descriptions of the methods but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All of our code, datasets and results are publicly available at http://safeai.ethz.ch/. |
| Open Datasets | Yes | We used the popular MNIST [18] and CIFAR10 [16] datasets for our experiments. |
| Dataset Splits | No | The paper mentions using "the popular MNIST [18] and CIFAR10 [16] datasets" and selecting "the first 100 images from the test set of each data set" for benchmarks. While standard datasets are used, explicit details about the train/validation/test splits (e.g., percentages or exact counts for each split) are not provided in the main text. Only a specific subset of the test set is highlighted for benchmarking. |
| Hardware Specification | Yes | All experiments for the FFNNs were carried out on a 3.3 GHz 10 core Intel i9-7900X Skylake CPU with 64 GB main memory; the CNNs and the residual network were evaluated on a 2.6 GHz 14 core Intel Xeon CPU E5-2690 with 512 GB main memory. |
| Software Dependencies | No | The paper states that the verifier is implemented in Python, that the abstract transformers are written in C, and that they are part of the ELINA [1, 25] library. However, no specific version numbers for Python, the C toolchain, or the ELINA library are provided. |
| Experiment Setup | Yes | For adversarial training, we used DiffAI [21] and projected gradient descent (PGD) [6] parameterized with ϵ. In our evaluation, we refer to the undefended nets as Point, and to the defended networks with the name of the training procedure (either DiffAI or PGD). More details on our neural networks and the training procedures can be found in the appendix. ... (PGD variant which we refer to as PGD in the graphs) [6] with µ = 1, 22 iterations and two restarts, where the step size is ϵ_step = ϵ/5.5 for the ϵ used for training; (a momentum-PGD sketch using these quoted parameters is given below the table) |
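
The Research Type row quotes the paper's evaluation of Zonotope transformers for certifying local robustness. The following is a minimal, generic sketch of zonotope bound propagation through a single affine layer, followed by a robustness check on the output logits. It is not the authors' DeepZ implementation (which adds specialized ReLU, sigmoid, and tanh transformers and is built on the ELINA library); the function and variable names (`affine`, `verify_robust`, `W`, `b`, ...) are invented for this illustration.

```python
# Generic zonotope propagation sketch (not the paper's DeepZ transformers).
# A zonotope is represented as a center vector plus a matrix of generators,
# one row per error term; each error term ranges over [-1, 1].
import numpy as np

def affine(center, generators, W, b):
    """Exact zonotope transformer for y = W x + b."""
    return W @ center + b, generators @ W.T   # generators: (num_error_terms, dim)

def bounds(center, generators):
    """Interval concretization of the zonotope."""
    radius = np.abs(generators).sum(axis=0)
    return center - radius, center + radius

def verify_robust(center, generators, true_label):
    """Check that the true logit provably exceeds every other logit."""
    # Each difference (logit_true - logit_j) is again affine in the error
    # terms, so its lower bound can be read off the zonotope exactly.
    diff_c = center[true_label] - center
    diff_g = generators[:, [true_label]] - generators
    lower = diff_c - np.abs(diff_g).sum(axis=0)
    lower[true_label] = np.inf                 # ignore the trivial j == true case
    return bool(np.all(lower > 0))

# Toy usage: an L-infinity ball of radius eps around x, pushed through one layer.
x, eps = np.array([0.5, -0.2]), 0.1
center, generators = x.copy(), eps * np.eye(2)   # one error term per input dimension
W, b = np.array([[1.0, -1.0], [0.3, 0.8]]), np.zeros(2)
c_out, g_out = affine(center, generators, W, b)
print(bounds(c_out, g_out), verify_robust(c_out, g_out, true_label=0))
```

In a full verifier, nonlinear layers would additionally require sound over-approximating transformers such as the ones the paper introduces.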
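The Experiment Setup row quotes a momentum-PGD configuration (µ = 1, 22 iterations, two restarts, step size ϵ/5.5). Below is a hedged sketch of such an L∞ attack; it is a generic re-implementation written for illustration, not the authors' training code, and details such as the gradient normalization, the omission of input-range clipping, and the exact step-size formula are assumptions that should be checked against the paper's appendix.

```python
# Minimal momentum-PGD sketch using the parameters quoted above
# (mu = 1, 22 iterations, two restarts, step size eps/5.5). Illustrative only.
import torch
import torch.nn.functional as F

def momentum_pgd(model, x, y, eps, iters=22, restarts=2, mu=1.0):
    """L-infinity momentum PGD; returns an adversarial example if one is found."""
    step = eps / 5.5
    best = x.clone()
    for _ in range(restarts):
        # Random start inside the eps ball.
        delta = torch.empty_like(x).uniform_(-eps, eps)
        delta.requires_grad_(True)
        g = torch.zeros_like(x)                      # momentum accumulator
        for _ in range(iters):
            loss = F.cross_entropy(model(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            g = mu * g + grad / grad.abs().mean().clamp_min(1e-12)  # normalized momentum
            with torch.no_grad():
                delta += step * g.sign()
                delta.clamp_(-eps, eps)              # project back into the eps ball
        with torch.no_grad():
            adv = (x + delta).detach()
            # Keep a restart that fools the model; otherwise return the clean input.
            if (model(adv).argmax(dim=1) != y).any():
                best = adv
    return best
```

For adversarial training, the network would then be optimized on the examples returned by `momentum_pgd` in place of (or alongside) the clean inputs.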