Fast and Effective Robustness Certification

Authors: Gagandeep Singh, Timon Gehr, Matthew Mirman, Markus Püschel, Martin Vechev

NeurIPS 2018

Reproducibility assessment — each entry lists the variable, the assessed result, and the supporting LLM response:
Research Type: Experimental. "We now evaluate the effectiveness of our new Zonotope transformers for verifying local robustness of neural networks. Our implementation is available as an end-to-end automated verifier, called DeepZ." ... "We used the popular MNIST [18] and CIFAR10 [16] datasets for our experiments." ... "Fig. 3 shows the percentage of verified robustness and the average analysis time of all three certifiers."
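The Zonotope transformers the paper evaluates propagate sets of the form x = c + G·ε, ε ∈ [−1, 1]^k, through the network layer by layer. Below is a minimal NumPy sketch of this domain with the DeepZ-style ReLU relaxation (slope λ = u/(u−l), offset μ = −λl/2, plus one fresh noise symbol per unstable neuron). The class and method names are my own illustration, not the authors' ELINA implementation:

```python
import numpy as np

class Zonotope:
    """Set of points x = c + G @ eps, with each eps_i in [-1, 1]."""
    def __init__(self, center, generators):
        self.c = np.asarray(center, dtype=float)      # shape (n,)
        self.G = np.asarray(generators, dtype=float)  # shape (n, k)

    def bounds(self):
        # Interval concretization: radius is the L1 norm of each row.
        r = np.abs(self.G).sum(axis=1)
        return self.c - r, self.c + r

    def affine(self, W, b):
        # Affine maps are exact on zonotopes: W(c + G eps) + b.
        return Zonotope(W @ self.c + b, W @ self.G)

    def relu(self):
        lb, ub = self.bounds()
        n, _ = self.G.shape
        c, G = self.c.copy(), self.G.copy()
        new_gens = []
        for i in range(n):
            if ub[i] <= 0:       # provably inactive: output is exactly 0
                c[i] = 0.0
                G[i, :] = 0.0
            elif lb[i] >= 0:     # provably active: identity, no loss
                pass
            else:                # unstable: sound linear over-approximation
                lam = ub[i] / (ub[i] - lb[i])
                mu = -lam * lb[i] / 2.0  # half the max approximation gap
                c[i] = lam * c[i] + mu
                G[i, :] *= lam
                g = np.zeros(n)
                g[i] = mu        # fresh noise symbol for this neuron
                new_gens.append(g)
        if new_gens:
            G = np.hstack([G, np.stack(new_gens, axis=1)])
        return Zonotope(c, G)
```

Starting from an L∞ ball (center image, generators ε·I), alternating `affine` and `relu` yields output bounds from which robustness can be checked.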
Researcher Affiliation: Academia. Gagandeep Singh, Timon Gehr, Matthew Mirman, Markus Püschel, Martin Vechev, Department of Computer Science, ETH Zurich, Switzerland. {gsingh,timon.gehr,matthew.mirman,pueschel,martin.vechev}@inf.ethz.ch
Pseudocode: No. The paper gives mathematical formulas and prose descriptions of its methods but includes no structured pseudocode or algorithm blocks.
Open Source Code: Yes. "All of our code, datasets and results are publicly available at http://safeai.ethz.ch/."
Open Datasets: Yes. "We used the popular MNIST [18] and CIFAR10 [16] datasets for our experiments."
Dataset Splits: No. The paper mentions using "the popular MNIST [18] and CIFAR10 [16] datasets" and selecting "the first 100 images from the test set of each data set" as benchmarks. Although standard datasets are used, the main text gives no explicit train/validation/test splits (percentages or exact counts per split); only a specific subset of the test set is identified, and only for benchmarking.
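The benchmark protocol quoted above (certify the first 100 test images at a given ϵ) reduces to a short loop. In this sketch, `verify_fn` is a hypothetical stand-in for a certifier such as DeepZ, not the paper's actual interface:

```python
def verified_fraction(verify_fn, images, labels, eps, n=100):
    """Fraction of the first n test inputs certified eps-robust.

    verify_fn(image, label, eps) -> bool is a hypothetical stand-in
    for a robustness certifier; it is NOT the paper's API.
    """
    batch = list(zip(images[:n], labels[:n]))
    hits = sum(bool(verify_fn(x, y, eps)) for x, y in batch)
    return hits / len(batch)
```

This is the quantity plotted as "percentage of verified robustness" in the paper's Fig. 3.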
Hardware Specification: Yes. "All experiments for the FFNNs were carried out on a 3.3 GHz 10 core Intel i9-7900X Skylake CPU with 64 GB main memory; the CNNs and the residual network were evaluated on a 2.6 GHz 14 core Intel Xeon CPU E5-2690 with 512 GB main memory."
Software Dependencies: No. The paper states that the verifier is "implemented in Python", that the abstract transformers are written in "C", and that they are part of the "ELINA [1, 25]" library, but it gives no version numbers for Python, the C compiler/runtime, or ELINA.
Experiment Setup: Yes. "For adversarial training, we used DiffAI [21] and projected gradient descent (PGD) [6] parameterized with ϵ. In our evaluation, we refer to the undefended nets as Point, and to the defended networks with the name of the training procedure (either DiffAI or PGD). More details on our neural networks and the training procedures can be found in the appendix." ... "(PGD variant which we refer to as PGD in the graphs) [6] with µ = 1, 22 iterations and two restarts, where the step size is ϵ/5.5 for the ϵ used for training".
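The PGD configuration quoted above (an L∞ ball of radius ϵ, a fixed step size, 22 iterations, two random restarts) can be sketched generically. `pgd_linf` and its interface are illustrative, not the authors' training code; the gradient is supplied as an analytic callable so the sketch stays self-contained:

```python
import numpy as np

def pgd_linf(x0, grad_fn, loss_fn, eps, step, iters, restarts, rng=None):
    """Projected gradient ascent on loss_fn inside the L-infinity ball
    of radius eps around x0, keeping the best of several random restarts."""
    rng = np.random.default_rng(rng)
    best_x, best_loss = x0.copy(), loss_fn(x0)
    for _ in range(restarts):
        # Random start inside the ball.
        x = x0 + rng.uniform(-eps, eps, size=x0.shape)
        for _ in range(iters):
            x = x + step * np.sign(grad_fn(x))  # signed-gradient step
            x = np.clip(x, x0 - eps, x0 + eps)  # project back onto the ball
        if loss_fn(x) > best_loss:
            best_x, best_loss = x, loss_fn(x)
    return best_x
```

In adversarial training, each minibatch example would be replaced by `pgd_linf(...)` applied to the network's loss before the parameter update.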