Certified Adversarial Robustness via Randomized Smoothing

Authors: Jeremy Cohen, Elan Rosenfeld, Zico Kolter

ICML 2019

Reproducibility Variable Result LLM Response
Research Type: Experimental. "We use the technique to train an ImageNet classifier with e.g. a certified top-1 accuracy of 49% under adversarial perturbations with ℓ2 norm less than 0.5 (=127/255). Smoothing is the only approach to certifiably robust classification which has been shown feasible on full-resolution ImageNet. On smaller-scale datasets where competing approaches to certified ℓ2 robustness are viable, smoothing delivers higher certified accuracies." (Section 4, Experiments)
Researcher Affiliation: Collaboration. Jeremy Cohen (Carnegie Mellon University), Elan Rosenfeld (Carnegie Mellon University), J. Zico Kolter (Carnegie Mellon University; Bosch Center for AI).
Pseudocode: Yes. The paper provides pseudocode for the certification and prediction procedures; a sketch of the certification procedure appears after this table.
Open Source Code: Yes. "Code and models are available at http://github.com/locuslab/smoothing."
Open Datasets: Yes. "We applied randomized smoothing to CIFAR-10 (Krizhevsky, 2009) and ImageNet (Deng et al., 2009)."
Dataset Splits: No. The paper mentions using "the full CIFAR-10 test set and a subsample of 500 examples from the ImageNet test set" for certification. However, it does not provide explicit details about the training, validation, or test splits (e.g., specific percentages or sample counts for each split).
Hardware Specification: Yes. "On CIFAR-10 our base classifier was a 110-layer residual network; certifying each example took 15 seconds on an NVIDIA RTX 2080 Ti."
Software Dependencies: No. The paper does not explicitly list software dependencies with version numbers (e.g., "PyTorch 1.9", "CUDA 11.1").
Experiment Setup: Yes. "The noise level σ is a hyperparameter of the smoothed classifier g which controls a robustness/accuracy tradeoff; it does not change with the input x. In all experiments, unless otherwise stated, we ran CERTIFY with α = 0.001, so there was at most a 0.1% chance that CERTIFY returned a radius in which g was not truly robust. Unless otherwise stated, when running CERTIFY we used n0 = 100 Monte Carlo samples for selection and n = 100,000 samples for estimation. We follow Lecuyer et al. (2019) and train the base classifier with Gaussian data augmentation at variance σ²." A sketch of training with this augmentation follows below.
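
To make the quoted certification hyperparameters (σ, α, n0 = 100, n = 100,000) concrete, here is a minimal Python sketch of the paper's CERTIFY procedure: pick the top class from n0 noisy samples, lower-bound its probability from n samples, and return the certified ℓ2 radius σ·Φ⁻¹(pA) or abstain. This is not the authors' released code; `base_classifier` and `sample_noisy_counts` are hypothetical placeholders, and the lower confidence bound is a one-sided Clopper-Pearson interval.

```python
import numpy as np
from scipy.stats import norm, beta

def sample_noisy_counts(base_classifier, x, num_samples, sigma, num_classes):
    """Count how often base_classifier (a hypothetical callable returning a class
    index) predicts each class on Gaussian-corrupted copies of x."""
    counts = np.zeros(num_classes, dtype=int)
    for _ in range(num_samples):
        noisy_x = x + sigma * np.random.randn(*x.shape)
        counts[base_classifier(noisy_x)] += 1
    return counts

def lower_confidence_bound(k, n, alpha):
    """One-sided (1 - alpha) Clopper-Pearson lower bound on a binomial proportion."""
    if k == 0:
        return 0.0
    return beta.ppf(alpha, k, n - k + 1)

def certify(base_classifier, x, sigma, num_classes, n0=100, n=100_000, alpha=0.001):
    """Randomized-smoothing certification sketch: select the top class with n0
    samples, lower-bound its probability with n samples, then either certify the
    radius sigma * Phi^{-1}(pA) or abstain."""
    counts0 = sample_noisy_counts(base_classifier, x, n0, sigma, num_classes)
    c_hat = int(np.argmax(counts0))                         # candidate top class
    counts = sample_noisy_counts(base_classifier, x, n, sigma, num_classes)
    p_a = lower_confidence_bound(counts[c_hat], n, alpha)   # lower bound on P(f(x + noise) = c_hat)
    if p_a > 0.5:
        return c_hat, sigma * norm.ppf(p_a)                 # prediction and certified L2 radius
    return None, 0.0                                        # abstain
```

In practice the noisy copies would be batched through the network rather than evaluated one at a time, but the statistical logic is the same.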
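
The experiment-setup row also states that the base classifier is trained with Gaussian data augmentation at variance σ², following Lecuyer et al. (2019). Below is a minimal PyTorch-style sketch of that augmentation inside an otherwise standard training loop; `model`, `train_loader`, and the optimizer are placeholders, not the authors' exact configuration.

```python
import torch
import torch.nn.functional as F

def train_epoch_with_gaussian_augmentation(model, train_loader, optimizer, sigma, device="cuda"):
    """One epoch of ordinary cross-entropy training, except each input is perturbed
    with isotropic Gaussian noise of standard deviation sigma (variance sigma^2)
    before the forward pass."""
    model.train()
    for inputs, targets in train_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        noisy_inputs = inputs + sigma * torch.randn_like(inputs)  # Gaussian data augmentation
        logits = model(noisy_inputs)
        loss = F.cross_entropy(logits, targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Consistent with the quoted text, the same fixed σ used for this augmentation is the noise level of the smoothed classifier at certification time.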