Effects of Exponential Gaussian Distribution on (Double Sampling) Randomized Smoothing

Authors: Youwei Shu, Xi Xiao, Derui Wang, Yuxin Cao, Siji Chen, Jason Xue, Linyi Li, Bo Li

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on real-world datasets confirm our theoretical analysis of the ESG distributions: they provide almost the same certification under different exponents η for both RS and DSRS.
Researcher Affiliation | Collaboration | Youwei Shu (1), Xi Xiao (1), Derui Wang (2), Yuxin Cao (1), Siji Chen (1), Minhui Xue (2), Linyi Li (3,4), Bo Li (3,5). (1) Shenzhen International Graduate School, Tsinghua University; (2) CSIRO's Data61; (3) University of Illinois Urbana-Champaign; (4) Simon Fraser University; (5) University of Chicago.
Pseudocode | Yes | Algorithm 1: Algorithm for finding tight µ for the Ω(√d) lower bound
Open Source Code | Yes | Our code is available at https://github.com/tdano1/eg-on-smoothing.
Open Datasets | Yes | All base classifiers used in this work are trained on CIFAR-10 (Krizhevsky et al., 2009) or ImageNet (Russakovsky et al., 2015), taking EGG with η = 2 as the noise distribution.
Dataset Splits | No | The paper mentions using a "test dataset" but does not give explicit train/validation/test splits (percentages or sample counts) needed for reproducibility. It primarily details sampling numbers for the noise distributions used in certification, rather than data partitioning for training.
Hardware Specification | Yes | All of our experiments on real-world datasets are composed of sampling and certification, which are finished with 4 NVIDIA RTX 3080 GPUs and CPUs.
Software Dependencies | No | The scipy package loses precision when calculating integrals for the Γ(a, 1) distribution with large parameters (say, a > 500) on infinite intervals. To solve this problem, we implement a Linear Numerical Integration (LNI) method to compute the expectations fast and accurately, based on Lemma 5.6. (An illustrative numerical-integration sketch follows the table.)
Experiment Setup | Yes | The sampling number N is set to 100000, with the significance level α = 0.001. In the double-sampling process, we set k = 1530 and k = 75260 for CIFAR-10 and ImageNet, respectively, consistent with the base classifiers. The sampling numbers N1, N2 are 50000, and the significance levels α1, α2 are 0.0005 for Monte Carlo sampling, equal for P and Q. The error bound e for the certified radius is set to 1×10⁻⁶. For all the ESG experiments, we set the number of segments to 256, and ι = 10⁻⁴. (A minimal certification sketch using these sampling numbers also follows the table.)
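
The sketch below illustrates the numerical issue noted under Software Dependencies. It is not the paper's LNI routine from Lemma 5.6; it only shows the general idea of integrating against a Γ(a, 1) density on a linear grid restricted to where the density has mass, instead of calling scipy.integrate.quad on an infinite interval. The function name expectation_lni, the grid size, and the truncation width are assumptions made for this sketch.

```python
# Illustrative sketch (not the paper's exact LNI method): approximating
# E_{X ~ Gamma(a, 1)}[f(X)] for a large shape parameter a on a truncated
# linear grid, where quad over (0, inf) can silently lose precision.
import numpy as np
from scipy import stats, integrate

def expectation_lni(f, a, num_points=4096, width=12.0):
    """Approximate E[f(X)] for X ~ Gamma(a, 1) on a linear grid.

    `width` is the number of standard deviations (sqrt(a)) kept around
    the mean (a); both knobs are hypothetical choices for this sketch.
    """
    std = np.sqrt(a)
    lo = max(a - width * std, 0.0)
    hi = a + width * std
    x = np.linspace(lo, hi, num_points)
    # logpdf avoids overflow of Gamma(a) for large a before exponentiating.
    w = np.exp(stats.gamma.logpdf(x, a))
    return np.trapz(f(x) * w, x)

if __name__ == "__main__":
    a = 1536.0                    # e.g. d/2 for CIFAR-10 (d = 3072)
    f = lambda x: np.sqrt(x)      # any smooth test function
    grid_val = expectation_lni(f, a)
    quad_val, _ = integrate.quad(lambda x: f(x) * stats.gamma.pdf(x, a),
                                 0.0, np.inf)
    print(grid_val, quad_val)     # quad may miss the sharp peak near x = a
```

The grid-based estimate stays stable because the integrand is evaluated only where the Γ(a, 1) density is non-negligible; adaptive quadrature on (0, ∞) can fail to locate that narrow peak when a is large.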
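The second sketch shows how the sampling number N = 100000 and significance level α = 0.001 from the Experiment Setup row enter a standard randomized-smoothing certification loop; the paper's DSRS procedure additionally draws N1, N2 samples from the distributions P and Q, which is not reproduced here. The names base_classifier, sigma, and certify_rs are placeholders, and the single-stage sampling is a simplification for brevity.

```python
# Minimal single-stage sketch of Monte Carlo certification with a
# Clopper-Pearson lower confidence bound (standard RS, not the paper's DSRS).
import numpy as np
from scipy.stats import norm
from statsmodels.stats.proportion import proportion_confint

def certify_rs(base_classifier, x, sigma, N=100_000, alpha=0.001):
    """Return (predicted class, certified L2 radius) for input x."""
    noise = np.random.randn(N, *x.shape) * sigma
    preds = base_classifier(x[None, ...] + noise)   # assumed: (N,) int labels
    top = int(np.bincount(preds).argmax())
    count_top = int((preds == top).sum())
    # One-sided (1 - alpha) lower confidence bound on P[f(x + noise) = top].
    p_lower, _ = proportion_confint(count_top, N, alpha=2 * alpha, method="beta")
    if p_lower <= 0.5:
        return top, 0.0                             # abstain: no certificate
    return top, sigma * norm.ppf(p_lower)
```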