On Breaking Deep Generative Model-based Defenses and Beyond

Authors: Yanzhi Chen, Renjie Xie, Zhanxing Zhu

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our attack better breaks state-of-the-art defenses (e.g., Defense-GAN, ABS) than other attacks (e.g., BPDA). Additionally, our empirical results provide insights for understanding the weaknesses of deep generative model defenses. In this section, we apply our attack to re-assess the robustness of recent deep generative model-based defenses. Using our attack, we also discover many interesting properties of these defenses not identified by previous works. Codes are available at https://github.com/cyz-ai/attack_DGM. Defenses. We focus on two state-of-the-art deep generative model defenses in the field: Defense-GAN (Samangouei et al., 2018) and Analysis-by-Synthesis (Schott et al., 2019). Baselines. We compare our attack with the two attacks mentioned in Section 4: the black-box NES attack and the white-box BPDA attack. Dataset. We conduct experiments on two common datasets: MNIST and CIFAR10.
Researcher Affiliation | Academia | Yanzhi Chen (1), Renjie Xie (2), Zhanxing Zhu (3). (1) School of Informatics, The University of Edinburgh, UK; (2) School of Information Engineering, Southeast University, China; (3) School of Mathematical Sciences, Peking University, China.
Pseudocode | Yes | Algorithm 1 Inversion attack. Input: clean image x, generative model G. Output: adversarial sample x′. Hyperparam: learning rate δ, range of λ: [λmin, λmax]. Initialize x′ as described above; repeat: for k = 1 to K do: determine η for x′ as in (8) via a pilot run; z-step: find the expression z* = Q(x′) with η; x-step: with Q, set x′ ← proj(x′ − δ∇x′ L(x′)); end for; if attack succeeds then λ ← (λ + λmin)/2 else λ ← (λ + λmax)/2 end if; until convergence; return x′. (See the code sketch below the table.)
Open Source Code | Yes | Codes are available at https://github.com/cyz-ai/attack_DGM.
Open Datasets | Yes | Dataset. We conduct experiments on two common datasets: MNIST and CIFAR10.
Dataset Splits | No | The paper states it uses MNIST and CIFAR10 but does not explicitly provide specific training, validation, or test split percentages or sample counts.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models.
Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other library versions).
Experiment Setup | Yes | We perform 75 gradient steps on x′ in all attacks. For NES we use the default setting as in the original literature (Ilyas et al., 2018). For the BPDA attack we use the version adapted to each defense (see Section 5.4.2 of Athalye et al. (2018) for Defense-GAN and the latent descent attack in Section 5 of Schott et al. (2019) for Analysis-by-Synthesis). We use a 32-dim latent representation for MNIST and a 64-dim latent representation for CIFAR10. In practice, we implement the above backtracking strategy with T = 5 backtracking steps under learning rate η. (See the backtracking sketch below the table.)
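
To make the Pseudocode row concrete, below is a minimal PyTorch sketch of the inversion-attack loop (alternating z-step and x-step, with a binary search over λ). The model handles (`generator`, `classifier`), the specific loss form dist(x′, x) − λ·cross-entropy, and the unrolled inner loop used here in place of the paper's closed-form expression Q in Eq. (8) are illustrative assumptions, not the authors' implementation; see the linked repository for the real code.

```python
# Minimal sketch of Algorithm 1 (inversion attack). Assumptions: `generator`
# maps a latent code z to an image, `classifier` is the defended classifier,
# the loss is dist(x', x) - lambda * cross_entropy, and the z-step is an
# unrolled inner gradient loop standing in for the paper's expression Q (Eq. 8).
import torch
import torch.nn.functional as F


def z_step(generator, x_adv, z_dim=32, steps=10, lr=0.1):
    """Approximate z* = argmin_z ||G(z) - x'||^2, keeping the autograd graph
    so gradients can flow from z* back to x' (differentiable stand-in for Q)."""
    z = torch.zeros(x_adv.size(0), z_dim, device=x_adv.device, requires_grad=True)
    for _ in range(steps):
        recon = generator(z)
        inner = ((recon - x_adv) ** 2).flatten(1).sum()
        (grad_z,) = torch.autograd.grad(inner, z, create_graph=True)
        z = z - lr * grad_z
    return z


def inversion_attack(x, y, generator, classifier, z_dim=32, delta=0.05, K=75,
                     lam_min=0.0, lam_max=10.0, outer_rounds=5):
    """Alternate the z-step and x-step for K iterations, then binary-search lambda."""
    lam = 0.5 * (lam_min + lam_max)
    x_adv = x.clone()
    for _ in range(outer_rounds):                    # "repeat ... until convergence"
        for _ in range(K):
            x_var = x_adv.clone().requires_grad_(True)
            z_star = z_step(generator, x_var, z_dim)  # z-step
            recon = generator(z_star)                 # input the defense actually classifies
            dist = ((x_var - x) ** 2).flatten(1).sum(dim=1).mean()
            fool = F.cross_entropy(classifier(recon), y)
            loss = dist - lam * fool                  # assumed form of L(x')
            (grad_x,) = torch.autograd.grad(loss, x_var)
            x_adv = (x_var - delta * grad_x).clamp(0.0, 1.0).detach()  # x-step + proj
        z_eval = z_step(generator, x_adv, z_dim).detach()
        pred = classifier(generator(z_eval)).argmax(dim=1)
        if (pred != y).all():                         # attack success
            lam = 0.5 * (lam + lam_min)
        else:
            lam = 0.5 * (lam + lam_max)
    return x_adv
```

The paper itself evaluates Q and its gradient via the analytic expression in Eq. (8) rather than by unrolling; only the structure of the loop is mirrored here.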
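
For the Experiment Setup row, the "backtracking strategy with T = 5 backtracking steps under learning rate η" could be realized by a standard backtracking step-size rule such as the sketch below; the acceptance criterion (require the loss to decrease, otherwise shrink the step) is an assumption, since the excerpt does not spell it out.

```python
def backtracking_step(x_var, grad_x, loss_fn, eta, T=5, shrink=0.5):
    """Try an update with step size eta on a torch tensor x_var; if the loss
    does not decrease, shrink the step and retry, for at most T attempts
    (assumed acceptance criterion)."""
    base = loss_fn(x_var)
    step = eta
    candidate = x_var
    for _ in range(T):
        candidate = (x_var - step * grad_x).clamp(0.0, 1.0)  # stay in pixel range
        if loss_fn(candidate) < base:
            break
        step *= shrink
    return candidate
```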