The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing
Authors: Blaise Delattre, Alexandre Araujo, Quentin Barthélemy, Alexandre Allauzen
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show a significant improvement in certified accuracy compared to current state-of-the-art methods. Our novel certification procedure allows us to use pre-trained models with randomized smoothing, effectively improving the current certification radius in a zero-shot manner. |
| Researcher Affiliation | Collaboration | 1 Foxstream, Vaulx-en-Velin, France; 2 Miles Team, LAMSADE, Université Paris-Dauphine, PSL University, Paris, France; 3 ECE, New York University, NY, USA; 4 ESPCI PSL, Paris, France |
| Pseudocode | Yes | Algorithm 1 generalized sparsemax(z, r) and Algorithm 2 LVM-RS (f, σ, x, n0, n) |
| Open Source Code | No | The paper does not explicitly state that its source code is available or provide a link to a code repository. |
| Open Datasets | Yes | This procedure demonstrates state-of-the-art results on the CIFAR-10 and ImageNet datasets, see Section 3.5. |
| Dataset Splits | No | The paper mentions 'n0' as a validation set size for Monte Carlo sampling within their method, but it does not provide specific train/validation/test split percentages or absolute counts for the CIFAR-10 or ImageNet datasets themselves, nor does it cite predefined splits for these datasets. |
| Hardware Specification | Yes | Computation was performed on a V100 GPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in their experiments. |
| Experiment Setup | Yes | We use 50 temperatures ranging from t_lower = 0.01 to t_upper = 50, and simplex maps S = {sparsemax, softmax, hardmax}. The baseline is the state-of-the-art top-performing model of Carlini et al. (2023), which smooths the hardmax of the base classifier and uses the Clopper-Pearson confidence interval to control the risk α. To compare the baseline with our method, certified accuracies are computed with R2 as a function of the perturbation level ε, for different noise levels σ = {0.25, 0.5, 1}. Results are presented in Figure 5 for CIFAR-10 and in Figure 6 for ImageNet. We use 10^5 samples for CIFAR-10 and 10^4 samples for ImageNet (as per Table 2 and Table 3 captions). |
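The simplex maps listed in the setup (sparsemax, softmax, hardmax) all send a logit vector onto the probability simplex. As a point of reference for the sparsemax entry, here is a minimal NumPy sketch of standard sparsemax (the Euclidean projection onto the simplex, per Martins & Astudillo, 2016); note this is an illustrative re-implementation, not the paper's Algorithm 1, whose generalized variant takes an extra radius parameter r.

```python
import numpy as np

def sparsemax(z):
    """Project logits z onto the probability simplex.

    Illustrative sketch of standard sparsemax; the paper's
    'generalized sparsemax(z, r)' (Algorithm 1) adds a radius
    parameter r not modeled here.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]            # sort logits descending
    cssv = np.cumsum(z_sorted)             # cumulative sums
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cssv      # coordinates kept in the support
    k_max = k[support][-1]                 # size of the support
    tau = (cssv[k_max - 1] - 1) / k_max    # threshold so outputs sum to 1
    return np.maximum(z - tau, 0.0)

p = sparsemax([2.0, 1.0, 0.1])             # sparse: mass concentrates on the top logit
```

Unlike softmax, sparsemax can assign exactly zero probability to low-scoring classes, which is why the paper treats it as a distinct simplex map between the extremes of softmax (dense) and hardmax (one-hot).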