Demystifying Poisoning Backdoor Attacks from a Statistical Perspective

Authors: Ganghua Wang, Xun Xian, Ashish Kundu, Jayanth Srinivasa, Xuan Bi, Mingyi Hong, Jie Ding

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the theory by conducting experiments using benchmark datasets and state-of-the-art backdoor attack scenarios."
Researcher Affiliation | Collaboration | Ganghua Wang (School of Statistics, University of Minnesota, wang9019@umn.edu); Xun Xian (Department of ECE, University of Minnesota, xian0044@umn.edu); Jayanth Srinivasa (Cisco Research, jasriniv@cisco.com); Ashish Kundu (Cisco Research, ashkundu@cisco.com); Xuan Bi (Carlson School of Management, University of Minnesota, xbi@umn.edu); Mingyi Hong (Department of ECE, University of Minnesota, mhong@umn.edu); Jie Ding (School of Statistics, University of Minnesota, dingj@umn.edu)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code is available here."
Open Datasets | Yes | "We conducted BadNets (Gu et al., 2017) on the MNIST (LeCun et al., 2010) and CIFAR-10 (Krizhevsky et al., 2009) datasets, utilizing both LeNet (LeCun et al., 2015) and ResNet (He et al., 2016) models."
Dataset Splits | Yes | "Following the setting in Theorem 3, we consider two-dimensional Gaussian distributions with m1 = (−3, 0), m0 = (3, 0), Σ is a diagonal matrix with Σ11 = 3 and Σ22 = 1/2, Pµ(Y = 1) = 0.5, training sample size n = 100, and backdoor data ratio ρ = 0.2. [...] The bandwidth is chosen by five-fold cross-validation (Ding et al., 2018). We evaluate the model performance on 1000 test inputs using the zero-one loss." (A sketch of this synthetic setup appears after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions various models and techniques like "kernel smoothing", "BadNets", "WaNet", "Adaptive Patch", "Adaptive Blend", "DDPM", "Transformer network", "LeNet", and "ResNet", but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup | Yes | "Following the setting in Theorem 3, we consider two-dimensional Gaussian distributions with m1 = (−3, 0), m0 = (3, 0), Σ is a diagonal matrix with Σ11 = 3 and Σ22 = 1/2, Pµ(Y = 1) = 0.5, training sample size n = 100, and backdoor data ratio ρ = 0.2. The ℓ2-norm of the backdoor trigger η is chosen from {1, 3, 5}, while its angle with m1 − m0 is chosen from {0, 45, 90, 135, 180} degrees. [...] In the case of MNIST, the backdoor triggers are 2-by-2 square patches, while for CIFAR-10, 3-by-3 square patches are utilized. All backdoor triggers are positioned at the lower-right corner of the inputs, replacing the original pixels with identical values. The pixel value represents the magnitude of the backdoor trigger, and the poisoning ratio is 5%. [...] In this conditional setup, the input space represents class labels, while the output space contains generated images. In the backdoor scenario, a new class labeled 10 was introduced, where the target images were modified MNIST '7' images with a square patch added in the lower-right corner. [...] We follow the word-level approach in (Chen et al., 2023) to design backdoor attacks, where we create backdoor inputs by inserting a trigger word 'Brunson', and the target output is a predefined word in the translated language." (A sketch of the patch-trigger poisoning appears after the table.)
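
The synthetic Gaussian setup quoted in the Dataset Splits and Experiment Setup rows can be reproduced roughly as follows. The sketch below is ours, not the authors' released code: it samples a two-class Gaussian mixture with a 20% backdoor ratio, fits a Nadaraya-Watson (kernel-smoothing) classifier whose Gaussian-kernel bandwidth is chosen by five-fold cross-validation, and reports the zero-one loss on 1000 clean test inputs. The way the backdoored points are built (trigger η added to class-0 inputs, labels flipped to 1) and the bandwidth grid are assumptions on our part.

```python
# Minimal sketch of the synthetic setup, assuming a simple backdoor construction.
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

m1, m0 = np.array([-3.0, 0.0]), np.array([3.0, 0.0])
cov = np.diag([3.0, 0.5])          # Sigma_11 = 3, Sigma_22 = 1/2
n, rho = 100, 0.2                  # training size, backdoor ratio
eta = np.array([3.0, 0.0])         # one trigger choice; norm and angle w.r.t. m1 - m0 are varied in the paper

def sample_clean(size):
    """Draw clean (x, y) pairs with P(Y = 1) = 0.5 and class-conditional Gaussians."""
    y = rng.binomial(1, 0.5, size)
    means = np.where(y[:, None] == 1, m1, m0)
    x = means + rng.multivariate_normal(np.zeros(2), cov, size)
    return x, y

# Training set: clean points plus backdoored points.
# Assumption: the trigger is added to fresh class-0-style inputs and the label is set to 1.
n_bd = int(rho * n)
x_clean, y_clean = sample_clean(n - n_bd)
x_src, _ = sample_clean(n_bd)
X = np.vstack([x_clean, x_src + eta])
Y = np.concatenate([y_clean, np.ones(n_bd, dtype=int)])

def nw_predict(X_tr, Y_tr, X_te, h):
    """Nadaraya-Watson estimate of P(Y=1|x) with a Gaussian kernel, thresholded at 1/2."""
    d2 = ((X_te[:, None, :] - X_tr[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * h ** 2))
    p = (w * Y_tr).sum(1) / np.maximum(w.sum(1), 1e-12)
    return (p > 0.5).astype(int)

# Five-fold cross-validation over an assumed bandwidth grid.
grid = np.logspace(-1, 1, 10)
cv_err = []
for h in grid:
    errs = [np.mean(nw_predict(X[tr], Y[tr], X[va], h) != Y[va])
            for tr, va in KFold(5, shuffle=True, random_state=0).split(X)]
    cv_err.append(np.mean(errs))
h_star = grid[int(np.argmin(cv_err))]

# Zero-one loss on 1000 clean test inputs.
X_te, Y_te = sample_clean(1000)
print("clean test error:", np.mean(nw_predict(X, Y, X_te, h_star) != Y_te))
```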
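
Likewise, the image-trigger construction described in the Experiment Setup row (a constant-valued square patch pasted over the lower-right corner of 5% of the training images) can be sketched as below. This is a minimal illustration under our own assumptions; in particular, the target label and the helper name `poison` are placeholders, since the target class is not specified in the quoted text.

```python
# Hypothetical sketch of the patch-trigger poisoning (not the authors' released code).
import numpy as np

def poison(images, labels, patch_size, pixel_value, ratio=0.05, target=0, seed=0):
    """Paste a lower-right square patch onto a random `ratio` of the data.

    images: (N, H, W) or (N, H, W, C) array; labels: (N,) array.
    `target` is a placeholder target label, not specified in the quoted text.
    """
    images, labels = images.copy(), labels.copy()
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=int(ratio * len(images)), replace=False)
    images[idx, -patch_size:, -patch_size:, ...] = pixel_value  # overwrite original pixels
    labels[idx] = target
    return images, labels, idx

# Example: 2x2 patch for 28x28 MNIST-like images; use patch_size=3 for 32x32 CIFAR-like images.
x = np.zeros((1000, 28, 28), dtype=np.float32)
y = np.random.randint(0, 10, size=1000)
x_p, y_p, poisoned_idx = poison(x, y, patch_size=2, pixel_value=1.0)
print(len(poisoned_idx), "images poisoned")
```

Overwriting the corner pixels with a single constant, rather than blending a pattern, matches the quoted description that the trigger replaces the original pixels with identical values whose magnitude is the trigger strength.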