Sound Randomized Smoothing in Floating-Point Arithmetic

Authors: Václav Voráček, Matthias Hein

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We present a simple example where randomized smoothing certifies a radius of 1.26 around a point even though there is an adversarial example at distance 0.8, and show how this can be abused to give false certificates for CIFAR10. We discuss the implicit assumptions of randomized smoothing and show that they do not apply to generic image classification models whose smoothed versions are commonly certified. To overcome this problem, we propose a sound approach to randomized smoothing under floating-point precision with essentially equal speed for quantized input. It yields sound certificates for image classifiers which, for the ones tested so far, are very similar to those of the unsound practice of randomized smoothing. Table 1: Time comparison of certification times per image of the standard randomized smoothing and the proposed sound procedure, with and without reusing noise. Details in Appendix A.1. A EXPERIMENTS: To run the experiments, we used the publicly available codebase of Salman et al. (2019), which is distributed under the MIT licence. Our modifications will be publicly available under the MIT licence. The experiments were run on a single Tesla V100 GPU.
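For context, the ℓ2 certificate of Cohen et al. (2019) referenced in this row computes a radius σ·Φ⁻¹(p) from a lower bound p on the smoothed classifier's top-class probability. A minimal sketch in exact terms (the function name and example numbers are ours, not from the paper; the paper's point is that evaluating this pipeline naively in floating-point arithmetic can overstate the radius):

```python
from scipy.stats import norm


def certified_radius(p_lower: float, sigma: float) -> float:
    """l2 certified radius of Cohen et al. (2019): R = sigma * Phi^{-1}(p_lower).

    p_lower is a lower confidence bound on the probability that the
    base classifier, under Gaussian noise of scale sigma, returns the
    top class.
    """
    return sigma * norm.ppf(p_lower)


# Illustrative numbers only: sigma = 0.5 and p_lower = 0.99
r = certified_radius(0.99, 0.5)  # roughly 1.16
```

Note that the formula itself is exact; the soundness issue the paper raises lies in how the smoothed classifier and the bound p are evaluated in finite precision, not in this closed form.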
Researcher Affiliation Academia Václav Voráček, Matthias Hein, Tübingen AI Center, University of Tübingen
Pseudocode Yes Algorithm 1 Randomized smoothing certification of Cohen et al. (2019) and Algorithm 2 Sound randomized smoothing certification of F gk (both found in Appendix K).
Open Source Code Yes Code is available at https://github.com/vvoracek/Sound-Randomized-Smoothing and our modifications will be publicly available under the MIT licence.
Open Datasets Yes We present a simple example where randomized smoothing certifies a radius of 1.26 around a point, even though there is an adversarial example at distance 0.8, and show how this can be abused to give false certificates for CIFAR10. For every image a in the CIFAR10 test set, we created an image from a by increasing the image intensity of a by 1/255 at 512 random positions. Table 1: Time comparison of certification times per image of the standard randomized smoothing and the proposed sound procedure, with and without reusing noise. Details in Appendix A.1. For CIFAR10 we used 100 000 random samples and for ImageNet 10 000.
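The CIFAR10 perturbation described in this row can be sketched as follows. Details such as clipping to [0, 1] and sampling positions without replacement are our assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)


def perturb(img: np.ndarray, k: int = 512) -> np.ndarray:
    """Increase pixel intensity by 1/255 at k random positions.

    Assumes img holds values in [0, 1]; results are clipped so they
    stay in range (our assumption, not specified in the excerpt).
    """
    flat = img.copy().reshape(-1)
    idx = rng.choice(flat.size, size=k, replace=False)
    flat[idx] = np.minimum(flat[idx] + 1 / 255, 1.0)
    return flat.reshape(img.shape)


img = rng.random((3, 32, 32)).astype(np.float32)  # stand-in for a CIFAR10 image
adv = perturb(img)
```

The resulting ℓ2 distance is at most sqrt(512)/255 ≈ 0.089, i.e. the perturbed image sits well inside the radii commonly certified, which is what makes the false-certificate demonstration possible.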
Dataset Splits No The paper mentions 'test set' but does not explicitly describe a 'validation set' split or its size/percentage for reproducibility.
Hardware Specification Yes The experiments were run on a single Tesla V100 GPU.
Software Dependencies No The paper mentions the 'SymPy library' and implies 'PyTorch' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup Yes Following the literature, we use 100 000 samples to estimate f̂(x) and then lower bound this by p for certifying class 1 (resp. upper bound it for class 0) so that the failure probability, that is when p > f̂(x) (resp. p < f̂(x)), is at most 0.001. For CIFAR10, the (standard) time per image for 100 000 samples used to evaluate the classifier is 9.82 ± 0.05 s. For ImageNet experiments, we used models: pretrained_models/imagenet/replication/resnet50/noise_σ/checkpoint.pth.tar, pretrained_models/imagenet/DNN_2steps/imagenet/eps_512/resnet50/noise_σ/checkpoint.pth.tar where σ ∈ {0.25, 0.50, 1.00} for Tables 5 and 6, respectively. Again, we used 10 000 samples to evaluate the smoothed classifier.
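The confidence bound described in this row is, in the standard randomized smoothing pipeline, a one-sided Clopper-Pearson binomial bound. A sketch of how such a p could be obtained from the 100 000 samples at failure probability 0.001 (the helper name and example counts are ours; the source only states the sample budget and the failure probability):

```python
from scipy.stats import beta


def lower_confidence_bound(k: int, n: int, alpha: float = 0.001) -> float:
    """One-sided Clopper-Pearson lower bound on a binomial success rate.

    With probability at least 1 - alpha over the n noise samples, the
    true probability f(x) is no smaller than the returned value, so it
    can safely stand in for p in the certification.
    """
    if k == 0:
        return 0.0
    return float(beta.ppf(alpha, k, n - k + 1))


# e.g. the top class won on 99 000 of 100 000 noisy samples
p = lower_confidence_bound(99_000, 100_000, alpha=0.001)  # slightly below 0.99
```

The bound is deliberately conservative: p is always below the empirical rate k/n, and the gap shrinks as the sample count n grows.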