Efficient Certification of Spatial Robustness
Authors: Anian Ruoss, Maximilian Baader, Mislav Balunović, Martin Vechev
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various network architectures and different datasets demonstrate the effectiveness and scalability of our method. We now investigate the precision and scalability of our certification method by evaluating it on a rich combination of datasets and network architectures. |
| Researcher Affiliation | Academia | Anian Ruoss, Maximilian Baader, Mislav Balunović, Martin Vechev, Department of Computer Science, ETH Zurich, anruoss@ethz.ch, {mbaader, mislav.balunovic, martin.vechev}@inf.ethz.ch |
| Pseudocode | No | The paper describes its methods primarily through prose and mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We make our code publicly available as part of the ERAN framework for neural network verification (available at https://github.com/eth-sri/eran). |
| Open Datasets | Yes | We select a random subset of 100 images from the MNIST (LeCun, Cortes, and Burges 2010) and CIFAR-10 (Krizhevsky 2009) test datasets on which we run all experiments. |
| Dataset Splits | No | We select a random subset of 100 images from the MNIST (LeCun, Cortes, and Burges 2010) and CIFAR-10 (Krizhevsky 2009) test datasets on which we run all experiments. While the paper uses established test datasets, it does not explicitly provide details about the train/validation/test splits used for their models, beyond stating they used a subset of test images for evaluation. |
| Hardware Specification | Yes | We use a desktop PC with a single GeForce RTX 2080 Ti GPU and a 16-core Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz... |
| Software Dependencies | No | The paper mentions software components such as DeepPoly, k-ReLU, MILP, and ReLU activations, but it does not specify their version numbers or other software dependencies with concrete version information. |
| Experiment Setup | Yes | We select a random subset of 100 images from the MNIST (LeCun, Cortes, and Burges 2010) and CIFAR-10 (Krizhevsky 2009) test datasets on which we run all experiments. We consider adversarially trained variants of the CONVSMALL, CONVMED, and CONVBIG architectures proposed by Mirman, Gehr, and Vechev (2018), using PGD (Madry et al. 2018) and DiffAI (Mirman, Gehr, and Vechev 2018) for adversarial training. For CIFAR-10, we also consider a ResNet (He et al. 2016), with 4 residual blocks of 16, 16, 32, and 64 filters each, trained with the provable defense from Wong et al. (2018). We present the model accuracies and training hyperparameters in Appendix B. We limit MILP to 5 minutes... |
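
The experiment setup quoted above fixes a random subset of 100 test images per dataset for all evaluations. The sketch below illustrates one way such a subset could be drawn; it is a minimal, hypothetical example assuming torchvision datasets, and the seed, data paths, and transforms are assumptions rather than details taken from the paper.

```python
import numpy as np
from torchvision import datasets, transforms

# Hypothetical sketch: draw a fixed random subset of 100 test images per dataset,
# mirroring the evaluation protocol described in the Experiment Setup row.
# Seed, root paths, and transforms are illustrative assumptions.
rng = np.random.default_rng(seed=0)

mnist_test = datasets.MNIST(root="./data", train=False, download=True,
                            transform=transforms.ToTensor())
cifar_test = datasets.CIFAR10(root="./data", train=False, download=True,
                              transform=transforms.ToTensor())

# Sample 100 distinct indices from each test set.
mnist_idx = rng.choice(len(mnist_test), size=100, replace=False)
cifar_idx = rng.choice(len(cifar_test), size=100, replace=False)

# Materialize the (image, label) pairs used for certification experiments.
mnist_subset = [mnist_test[int(i)] for i in mnist_idx]
cifar_subset = [cifar_test[int(i)] for i in cifar_idx]
```

Fixing the seed makes the chosen subset reproducible across runs; the paper itself does not state how its 100-image subsets were sampled.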