Efficient Certification of Spatial Robustness

Authors: Anian Ruoss, Maximilian Baader, Mislav Balunović, Martin Vechev (pp. 2504-2513)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on various network architectures and different datasets demonstrate the effectiveness and scalability of our method. We now investigate the precision and scalability of our certification method by evaluating it on a rich combination of datasets and network architectures.
Researcher Affiliation | Academia | Anian Ruoss, Maximilian Baader, Mislav Balunović, Martin Vechev, Department of Computer Science, ETH Zurich. anruoss@ethz.ch, {mbaader, mislav.balunovic, martin.vechev}@inf.ethz.ch
Pseudocode | No | The paper describes its methods primarily through prose and mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | We make our code publicly available as part of the ERAN framework for neural network verification (available at https://github.com/eth-sri/eran).
Open Datasets | Yes | We select a random subset of 100 images from the MNIST (LeCun, Cortes, and Burges 2010) and CIFAR-10 (Krizhevsky 2009) test datasets on which we run all experiments.
Dataset Splits | No | We select a random subset of 100 images from the MNIST (LeCun, Cortes, and Burges 2010) and CIFAR-10 (Krizhevsky 2009) test datasets on which we run all experiments. While the paper uses established test datasets, it does not explicitly provide details about the train/validation/test splits used for their models, beyond stating they used a subset of test images for evaluation.
Hardware Specification | Yes | We use a desktop PC with a single GeForce RTX 2080 Ti GPU and a 16-core Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz...
Software Dependencies | No | The paper mentions software components such as DeepPoly, k-ReLU, MILP, and ReLU activations, but it does not specify their version numbers or other software dependencies with concrete version information.
Experiment Setup | Yes | We select a random subset of 100 images from the MNIST (LeCun, Cortes, and Burges 2010) and CIFAR-10 (Krizhevsky 2009) test datasets on which we run all experiments. We consider adversarially trained variants of the CONVSMALL, CONVMED, and CONVBIG architectures proposed by Mirman, Gehr, and Vechev (2018), using PGD (Madry et al. 2018) and DiffAI (Mirman, Gehr, and Vechev 2018) for adversarial training. For CIFAR-10, we also consider a ResNet (He et al. 2016), with 4 residual blocks of 16, 16, 32, and 64 filters each, trained with the provable defense from Wong et al. (2018). We present the model accuracies and training hyperparameters in Appendix B. We limit MILP to 5 minutes...
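
The subset-selection step quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name, the random seed, and the default test-set size (10,000 images for both MNIST and CIFAR-10) are our own assumptions, since the paper does not state how the 100 images were drawn beyond "a random subset".

```python
import random


def select_eval_subset(num_test_images: int = 10000,
                       subset_size: int = 100,
                       seed: int = 0) -> list:
    """Sample a fixed random subset of test-set indices for certification.

    Mirrors the paper's setup of evaluating on a random subset of 100
    MNIST/CIFAR-10 test images. The seed is a hypothetical choice made
    here so that the subset is reproducible across runs.
    """
    rng = random.Random(seed)
    # Sample without replacement so each test image is certified at most once.
    return sorted(rng.sample(range(num_test_images), subset_size))


indices = select_eval_subset()
```

Fixing the seed matters for a reproducibility study: without it, each certification run would evaluate a different 100-image subset and the reported verified-accuracy numbers would not be directly comparable.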