Provably Robust Adversarial Examples

Authors: Dimitar Iliev Dimitrov, Gagandeep Singh, Timon Gehr, Martin Vechev

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental evaluation shows the effectiveness of PARADE: it successfully finds large provably robust regions, including ones containing ≈10^573 adversarial examples for pixel intensity perturbations and ≈10^599 for geometric perturbations. The provability enables our robust examples to be significantly more effective against state-of-the-art defenses based on randomized smoothing than the individual attacks used to construct the regions.
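The region sizes quoted above (≈10^573 and ≈10^599) are counts of discrete inputs inside a certified region. A back-of-the-envelope sketch of how numbers of that magnitude arise (our own illustration with made-up per-pixel counts, not the paper's computation; the paper's regions are general convex shapes, not plain boxes):

```python
from math import log10

# Back-of-the-envelope: how many discrete images lie inside an L-infinity
# box around an MNIST image? Even a few allowed values per pixel yields an
# astronomically large count, because counts multiply across pixels.
num_pixels = 28 * 28       # MNIST input dimension
levels_per_pixel = 5       # hypothetical: each pixel may take 5 of the 256 8-bit values

# Total count is levels_per_pixel ** num_pixels; report its order of magnitude.
order = num_pixels * log10(levels_per_pixel)
print(f"~10^{order:.0f} distinct images")  # ~10^548 under these assumptions
```

Exponents of this kind scale linearly with the input dimension, which is why even slightly wider per-pixel intervals push the count past 10^573.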
Researcher Affiliation | Collaboration | Dimitar I. Dimitrov (ETH Zurich), Gagandeep Singh (University of Illinois Urbana-Champaign; VMware Research), Timon Gehr (ETH Zurich), Martin Vechev (ETH Zurich)
Pseudocode | Yes | Algorithm 1 GENERATE_UNDERAPPROX; Algorithm 2 PGD_PROJECT; Algorithm 3 GEN_POLY; Algorithm 4 GEN_OBJ_PLANES; Algorithm 5 GEN_BOUND_PLANES; Algorithm 6 ADJUST_BIAS
Open Source Code | Yes | We make the code of PARADE available at https://github.com/eth-sri/parade.git
Open Datasets | Yes | Neural Networks. We use MNIST (LeCun et al., 1998) and CIFAR10 (Krizhevsky, 2009) based neural networks.
Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. It mentions using 'the first 100 test images' for evaluation and discusses training networks, but does not detail how the overall dataset was partitioned for training and validation.
Hardware Specification | Yes | We ran all our experiments on a 2.8 GHz 16 core Intel(R) Xeon(R) Gold 6242 processor with 64 GB RAM.
Software Dependencies | Yes | We implemented PARADE in Python and used TensorFlow (Abadi et al., 2015) for generating PGD attacks. We use Gurobi 9.0 (Gurobi Optimization, LLC, 2020) for solving the LP instances. We rely on ERAN (Singh et al., 2018b) for its DeepPoly and DeepG implementations.
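The paper solves its LP instances with Gurobi 9.0, which is commercial software. As a license-free sketch of the same kind of problem, here is a tiny linear program solved with SciPy's `linprog` as a stand-in solver (the objective, constraint, and bounds below are made up for illustration; they are not from the paper):

```python
import numpy as np
from scipy.optimize import linprog

# Minimal LP of the kind a verifier might solve: minimize a linear objective
# over box bounds plus one linear inequality constraint.
c = np.array([1.0, -2.0])           # objective: min x0 - 2*x1
A_ub = np.array([[1.0, 1.0]])       # constraint: x0 + x1 <= 1.5
b_ub = np.array([1.5])
bounds = [(0.0, 1.0), (0.0, 1.0)]   # box bounds on each variable

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print(res.x, res.fun)               # optimum at x = [0, 1], objective -2
```

A production verifier would use Gurobi's Python API (`gurobipy`) for speed on large instances; the mathematical formulation is the same.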
Experiment Setup | Yes | Additional details about the experimental setup are given in Appendix C.2. ... For all experiments in Table 2, we compute O using random sampling attacks... We use c = 0.65 for all experiments... For the MNIST experiments in Table 2, we execute SHRINK_PGD with 50 initializations and 200 PGD steps of size 5e-5. For the CIFAR10 experiments, we use 500 initializations and 50 PGD steps instead.
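The setup above describes multi-start PGD: several random initializations inside an L-infinity ball, many small signed-gradient steps, each step projected back into the ball. A minimal sketch of that loop, using a toy quadratic objective as a stand-in for a network's attack loss (the objective, `target`, and all dimensions here are our own illustrative choices, not the paper's SHRINK_PGD):

```python
import numpy as np

rng = np.random.default_rng(0)

def pgd_attack(x0, eps, n_init=50, n_steps=200, step=5e-5):
    """Multi-start projected gradient ascent inside the L-inf ball around x0."""
    target = x0 + 2 * eps                    # toy: pushes iterates toward the ball boundary
    loss = lambda x: -np.sum((x - target) ** 2)
    grad = lambda x: -2 * (x - target)
    best_x, best_loss = x0, loss(x0)
    for _ in range(n_init):
        x = x0 + rng.uniform(-eps, eps, size=x0.shape)  # random start inside the ball
        for _ in range(n_steps):
            x = x + step * np.sign(grad(x))             # signed ascent step on the loss
            x = np.clip(x, x0 - eps, x0 + eps)          # project back into the L-inf ball
        if loss(x) > best_loss:
            best_x, best_loss = x, loss(x)
    return best_x

x0 = np.zeros(4)
adv = pgd_attack(x0, eps=0.01)
print(np.max(np.abs(adv - x0)) <= 0.01)  # the result stays inside the ball
```

The projection step is what keeps every iterate a valid perturbation; the multiple random starts trade runtime for a better chance of escaping poor local optima, matching the 50-vs-500 initialization trade-off in the quoted setup.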