Certified Adversarial Robustness via Randomized Smoothing
Authors: Jeremy Cohen, Elan Rosenfeld, Zico Kolter
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use the technique to train an ImageNet classifier with e.g. a certified top-1 accuracy of 49% under adversarial perturbations with ℓ2 norm less than 0.5 (=127/255). Smoothing is the only approach to certifiably robust classification which has been shown feasible on full-resolution ImageNet. On smaller-scale datasets where competing approaches to certified ℓ2 robustness are viable, smoothing delivers higher certified accuracies. (Section 4, Experiments) |
| Researcher Affiliation | Collaboration | Jeremy Cohen 1 Elan Rosenfeld 1 J. Zico Kolter 1 2 1Carnegie Mellon University 2Bosch Center for AI. |
| Pseudocode | Yes | Pseudocode for certification and prediction |
| Open Source Code | Yes | Code and models are available at http://github.com/locuslab/smoothing. |
| Open Datasets | Yes | We applied randomized smoothing to CIFAR-10 (Krizhevsky, 2009) and ImageNet (Deng et al., 2009). |
| Dataset Splits | No | The paper mentions using "the full CIFAR-10 test set and a subsample of 500 examples from the ImageNet test set" for certification. However, it does not provide explicit details about the training, validation, or test dataset splits (e.g., specific percentages or sample counts for each split). |
| Hardware Specification | Yes | On CIFAR-10 our base classifier was a 110-layer residual network; certifying each example took 15 seconds on an NVIDIA RTX 2080 Ti. |
| Software Dependencies | No | The paper does not explicitly list software dependencies with version numbers (e.g., "PyTorch 1.9", "CUDA 11.1"). |
| Experiment Setup | Yes | The noise level σ is a hyperparameter of the smoothed classifier g which controls a robustness/accuracy tradeoff; it does not change with the input x. In all experiments, unless otherwise stated, we ran CERTIFY with α = 0.001, so there was at most a 0.1% chance that CERTIFY returned a radius in which g was not truly robust. Unless otherwise stated, when running CERTIFY we used n0 = 100 Monte Carlo samples for selection and n = 100,000 samples for estimation. We follow Lecuyer et al. (2019) and train the base classifier with Gaussian data augmentation at variance σ². |
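The CERTIFY settings in the table (n0 = 100 selection samples, n = 100,000 estimation samples, α = 0.001) can be sketched as follows. This is an illustrative reimplementation, not the authors' released code: it assumes a black-box base classifier `f` mapping an input array to a class index, and uses a Clopper-Pearson lower bound plus the Gaussian inverse CDF to compute the certified ℓ2 radius σ·Φ⁻¹(p̲A).

```python
import numpy as np
from scipy.stats import beta, norm

def lower_conf_bound(k, n, alpha):
    # One-sided (1 - alpha) Clopper-Pearson lower bound on a binomial
    # proportion, given k successes out of n trials.
    return beta.ppf(alpha, k, n - k + 1) if k > 0 else 0.0

def certify(f, x, sigma, num_classes, n0=100, n=100_000, alpha=0.001, seed=0):
    # Sketch of the CERTIFY procedure; `f` is a hypothetical base classifier.
    rng = np.random.default_rng(seed)
    # Step 1: guess the smoothed classifier's top class from n0 noisy samples.
    counts = np.zeros(num_classes, dtype=int)
    for _ in range(n0):
        counts[f(x + sigma * rng.standard_normal(x.shape))] += 1
    c_hat = int(np.argmax(counts))
    # Step 2: lower-bound P(f(x + noise) = c_hat) with n fresh samples.
    k = sum(f(x + sigma * rng.standard_normal(x.shape)) == c_hat
            for _ in range(n))
    p_lower = lower_conf_bound(k, n, alpha)
    if p_lower > 0.5:
        # Certified l2 radius from the paper's main theorem: sigma * Phi^-1(pA).
        return c_hat, sigma * norm.ppf(p_lower)
    return None, 0.0  # ABSTAIN
```

With the table's defaults, a single certification draws 100,100 forward passes through the base classifier, which is why certifying one CIFAR-10 example takes on the order of seconds.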