Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness?
Authors: Vikash Sehwag, Saeed Mahloujifar, Tinashe Handina, Sihui Dai, Chong Xiang, Mung Chiang, Prateek Mittal
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Next we use proxy distributions to significantly improve the performance of adversarial training on five different datasets. For example, we improve robust accuracy by up to 7.5% and 6.7% in the ℓ∞ and ℓ2 threat models over baselines that are not using proxy distributions on the CIFAR-10 dataset. We also improve certified robust accuracy by 7.6% on the CIFAR-10 dataset. |
| Researcher Affiliation | Academia | Princeton University, Caltech, Purdue University |
| Pseudocode | No | The paper describes its methods narratively but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/inspire-group/proxy-distributions. For further reproducibility, we have also submitted our code with the supplementary material. |
| Open Datasets | Yes | We consider five datasets, namely CIFAR-10 (Krizhevsky et al., 2014), CIFAR-100 (Krizhevsky et al., 2014), CelebA (Liu et al., 2015), AFHQ (Choi et al., 2020), and ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | We consider five datasets, namely CIFAR-10 (Krizhevsky et al., 2014), CIFAR-100 (Krizhevsky et al., 2014), CelebA (Liu et al., 2015), AFHQ (Choi et al., 2020), and ImageNet (Deng et al., 2009). We keep 10,000 synthetic images from this set for validation and train on the rest of them. |
| Hardware Specification | Yes | Using a 4x RTX 2080Ti GPU cluster, it takes 23.8 hours to sample one million images on the CIFAR-10 dataset. |
| Software Dependencies | No | The paper describes the use of various models and tools (e.g., 'ResNet-18', 'AutoAttack'), but it does not specify any software libraries or frameworks with version numbers (e.g., 'PyTorch 1.9' or 'Python 3.8'). |
| Experiment Setup | Yes | We train each network using stochastic gradient descent and 0.1 learning rate with cosine learning rate decay, weight decay of 5×10⁻⁴, batch size 128, and 200 epochs. We use γ = 0.4 as it achieves the best results (Appendix D). We combine real and synthetic images in a 1:1 ratio in each batch. (A sketch of this setup follows the table.) |
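
The reported setup corresponds to a fairly standard adversarial-training loop with mixed real/synthetic batches. Below is a minimal sketch, assuming PyTorch; the model, the two DataLoaders, and the `pgd_attack` helper are hypothetical placeholders, and the SGD momentum value is an assumption not stated in the excerpt above.

```python
# Hedged sketch of the reported training setup: SGD, lr 0.1 with cosine decay,
# weight decay 5e-4, batch size 128, 200 epochs, 1:1 real/synthetic mixing.
import torch
import torch.nn as nn

def train(model, real_loader, synthetic_loader, device="cuda"):
    # SGD with lr 0.1 and weight decay 5e-4 (as reported);
    # momentum=0.9 is an assumption, not stated in the paper excerpt.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                momentum=0.9, weight_decay=5e-4)
    # Cosine learning-rate decay over the 200 training epochs.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)
    criterion = nn.CrossEntropyLoss()

    for epoch in range(200):
        for (x_real, y_real), (x_syn, y_syn) in zip(real_loader, synthetic_loader):
            # 1:1 mix of real and proxy-distribution (synthetic) images per batch;
            # with batch size 128, this means 64 real + 64 synthetic samples.
            x = torch.cat([x_real, x_syn]).to(device)
            y = torch.cat([y_real, y_syn]).to(device)

            # Adversarial training step: perturb inputs with an attack such as
            # PGD, then minimize loss on the perturbed batch. pgd_attack is a
            # hypothetical helper; the paper's exact formulation (including the
            # role of γ = 0.4) is not reproduced here.
            x_adv = pgd_attack(model, x, y)
            optimizer.zero_grad()
            loss = criterion(model(x_adv), y)
            loss.backward()
            optimizer.step()
        scheduler.step()
```

Here each DataLoader would use a per-source batch size of 64 so that the combined batch matches the reported size of 128.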