Boosting Randomized Smoothing with Variance Reduced Classifiers
Authors: Miklós Z. Horváth, Mark Niklas Müller, Marc Fischer, Martin Vechev
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, we show that ensembles of only 3 to 10 classifiers consistently improve on their strongest constituting model with respect to their average certified radius (ACR) by 5% to 21% on both CIFAR10 and ImageNet, achieving a new state-of-the-art ACR of 0.86 and 1.11, respectively. We evaluate the proposed methods on the CIFAR10 (Krizhevsky et al., 2009) and ImageNet (Russakovsky et al., 2015) datasets with respect to two key metrics: (i) the certified accuracy at predetermined radii and (ii) the average certified radius (ACR). (Both metrics are sketched in code below the table.) |
| Researcher Affiliation | Academia | Miklós Z. Horváth, Mark Niklas Müller, Marc Fischer, Martin Vechev Department of Computer Science ETH Zurich, Switzerland mihorvat@ethz.ch, {mark.mueller,marc.fischer,martin.vechev}@inf.ethz.ch |
| Pseudocode | Yes | Algorithm 1 (Certify, from Cohen et al., 2019) and Algorithm 2 (Adaptive Sampling). (A hedged sketch of Certify follows the table.) |
| Open Source Code | Yes | We release all code and models required to reproduce our results at https://github.com/eth-sri/smoothing-ensembles. |
| Open Datasets | Yes | We evaluate the proposed methods on the CIFAR10 (Krizhevsky et al., 2009) and ImageNet (Russakovsky et al., 2015) datasets with respect to two key metrics: (i) the certified accuracy at predetermined radii and (ii) the average certified radius (ACR). |
| Dataset Splits | Yes | We evaluate every 20th image of the CIFAR10 test set and every 100th of the ImageNet test set (500 samples total). ImageNet (Russakovsky et al., 2015) contains 1 281 167 training and 50 000 validation images, partitioned into 1000 classes. To rank the single models for CIFAR10, we have evaluated them on a disjoint hold-out portion of the CIFAR10 test set. Concretely, we use the test images with indices 1, 21, 41, ..., 9981 to rank the single models for GAUSSIAN, CONSISTENCY, and SMOOTHADV trained models. (The index arithmetic behind this split is sketched below the table.) |
| Hardware Specification | Yes | We implement our approach in PyTorch (Paszke et al., 2019) and evaluate on CIFAR10 with ensembles of ResNet20 and ResNet110 and on ImageNet with ensembles of ResNet50 (He et al., 2016), using 1 and 2 GeForce RTX 2080 Ti, respectively. CIFAR10 models were trained on a single GeForce RTX 2080 Ti and ImageNet models on quadruple 2080 Tis. |
| Software Dependencies | No | The paper mentions PyTorch but does not give a version number, and other key dependencies such as CUDA are likewise not pinned to specific versions, which limits exact reproducibility. For example: 'We implement our approach in PyTorch (Paszke et al., 2019)...' |
| Experiment Setup | Yes | We use the same training schedule and optimizer for all models, i.e., stochastic gradient descent with Nesterov momentum (weight = 0.9, no dampening), with an ℓ2 weight decay of 0.0001. For CIFAR10, we use a batch size of 256 and an initial learning rate of 0.1, reducing it by a factor of 10 every 50 epochs and training for a total of 150 epochs. For ImageNet, we use the same settings, only reducing the total epoch number to 90 and decreasing the learning rate every 30 epochs. (An optimizer/scheduler sketch follows the table.) |
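
The two quoted evaluation metrics can be computed directly from per-sample certification output. Below is a minimal sketch, not the authors' code; the array layout and variable names are assumptions.

```python
import numpy as np

def certified_accuracy(radii, correct, r):
    """Fraction of test samples correctly classified and certified at radius >= r."""
    return np.mean(correct & (radii >= r))

def average_certified_radius(radii, correct):
    """ACR: mean certified radius over the test set, counting misclassified
    or abstained samples as radius 0."""
    return np.mean(np.where(correct, radii, 0.0))

# Hypothetical certification results for 5 test samples.
radii = np.array([0.5, 1.2, 0.0, 0.8, 2.0])
correct = np.array([True, True, False, True, True])
print(certified_accuracy(radii, correct, r=0.5))  # 0.8
print(average_certified_radius(radii, correct))   # 0.9
```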
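
Algorithm 1 follows the published Certify procedure of Cohen et al. (2019): guess the top class from a small number of noisy samples, lower-bound its probability p_A with a larger sample via a Clopper-Pearson bound, and certify an ℓ2 radius of σ·Φ⁻¹(p_A), abstaining when p_A ≤ 1/2. The sketch below is a simplified NumPy rendering under the assumption that `base_classifier` is a callable mapping an input array to a class index; it is not the repository's implementation.

```python
import numpy as np
from scipy.stats import beta, norm

def noisy_counts(base_classifier, x, n, sigma, num_classes):
    """Class counts of the base classifier over n Gaussian-perturbed copies of x."""
    counts = np.zeros(num_classes, dtype=int)
    for _ in range(n):
        counts[base_classifier(x + sigma * np.random.randn(*x.shape))] += 1
    return counts

def certify(base_classifier, x, sigma, n0=100, n=100_000, alpha=0.001, num_classes=10):
    """Sketch of Certify (Cohen et al., 2019)."""
    # Guess the top class from n0 samples, then estimate it with n fresh samples.
    guess = int(noisy_counts(base_classifier, x, n0, sigma, num_classes).argmax())
    nA = noisy_counts(base_classifier, x, n, sigma, num_classes)[guess]
    # One-sided Clopper-Pearson lower confidence bound on p_A.
    pA = beta.ppf(alpha, nA, n - nA + 1) if nA > 0 else 0.0
    if pA <= 0.5:
        return None, 0.0                    # abstain
    return guess, sigma * norm.ppf(pA)      # certified l2 radius
```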
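
The quoted evaluation/hold-out split is simple index arithmetic. The snippet below is an assumption consistent with the quoted indices (not extracted from the repository); it shows that the two 500-image subsets of the 10 000-image CIFAR10 test set are disjoint.

```python
# Certification subset: every 20th CIFAR10 test image (assumed indices 0, 20, ..., 9980).
eval_indices = list(range(0, 10_000, 20))      # 500 images
# Hold-out used to rank single models (indices 1, 21, ..., 9981, as quoted).
holdout_indices = list(range(1, 10_000, 20))   # 500 images
assert set(eval_indices).isdisjoint(holdout_indices)
```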
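
The quoted CIFAR10 training configuration maps directly onto standard PyTorch components. This is a hedged sketch with a placeholder architecture, not the released training script.

```python
import torch
import torchvision

model = torchvision.models.resnet18()  # placeholder; the paper uses ResNet20/110 and ResNet50
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            dampening=0, nesterov=True, weight_decay=1e-4)
# Decay the learning rate by 10x every 50 epochs (every 30 epochs for ImageNet).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.1)

for epoch in range(150):  # 90 epochs for ImageNet
    # ... one training epoch with batch size 256 ...
    scheduler.step()
```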