Reconstructive Neuron Pruning for Backdoor Defense
Authors: Yige Li, Xixiang Lyu, Xingjun Ma, Nodens Koren, Lingjuan Lyu, Bo Li, Yu-Gang Jiang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results across different datasets and model architectures show that our RNP outperforms the current state-of-the-art method ANP (Wu & Wang, 2021) against 9/12 attacks on the CIFAR-10 dataset and 5/5 attacks on an ImageNet subset. |
| Researcher Affiliation | Collaboration | ¹Xidian University, ²Fudan University, ³Shanghai Artificial Intelligence Laboratory, ⁴University of Copenhagen, ⁵Sony AI, ⁶University of Illinois at Urbana-Champaign. |
| Pseudocode | Yes | Algorithm 1 Reconstructive Neuron Pruning (RNP). Input: a backdoored model fθ(·) with parameters θ, the total number of classes K, defense data Dd, learning rate η, clean accuracy threshold CAmin, dynamic threshold DT ∈ [0, 1]. 1: Sample a mini-batch (Xd, Yd) from Dd. # Neuron-level unlearning. 2: repeat 3: θ̂ ← max_θ L(f(xd, yd; θ)) 4: until fθ̂'s clean accuracy CA_fθ̂(Dd) ≤ CAmin. 5: Backdoor label: yt = argmax_K f(xd; θ̂). # Filter-level recovering. 6: mκ = [1]ⁿ # initialized to be all ones. 7: repeat 8: mκ ← mκ − η · ∂L(f(Xd, Yd; mκ ⊙ θ̂))/∂mκ 9: mκ = clip[0,1](mκ) # 0–1 clipping. 10: until training converged. # Pruning. 11: mκ = I(mκ > DT) # binarization for pruning. Output: f_(mκ⊙θ), yt |
| Open Source Code | Yes | Code is available at https://github.com/bboylyg/RNP. |
| Open Datasets | Yes | We follow the default settings suggested in their original papers and the open-source code for most attacks, including the trigger pattern, trigger size, and backdoor label. As in previous works (Wu & Wang, 2021; Li et al., 2021b), we evaluate the defense performance against the 12 attacks on the CIFAR-10 dataset and an ImageNet-12 dataset. |
| Dataset Splits | Yes | All defenses share limited access to only 500 clean samples as their defense data (for both CIFAR-10 and ImageNet-12). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'PyTorch' for implementation and refers to 'open-sourced code' for various baselines (NC, NAD, I-BAU, ANP) but does not provide specific version numbers for these software components or any other libraries. |
| Experiment Setup | Yes | We trained all models for 200 epochs using Stochastic Gradient Descent (SGD) with an initial learning rate of 0.1, a batch size of 128, and a weight decay of 5e-4 to obtain the backdoored models. The learning rate was divided by 10 at the 60th and 120th epochs. [...] For our RNP defense, we maximized the loss of the unlearned model fθ̂ for 20 epochs until its clean accuracy dropped to 10% (random guess) with a learning rate of 0.01, a batch size of 128, and a weight decay of 5e-2. For the recovering step, we optimized the filter mask for 20 epochs with a learning rate of 0.2. |
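The three-stage pipeline in Algorithm 1 (unlearn by maximizing the clean loss, recover a [0, 1] mask on the frozen unlearned weights, then binarize the mask with threshold DT) can be sketched in plain NumPy on a toy softmax classifier. This is an illustrative sketch, not the authors' released code: the mask here sits on the classifier's input features as a stand-in for the paper's filter-level mask, and the data, model, and hyperparameters are all made up for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 10, 4                          # samples, features, classes
X = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, k))
y = (X @ W_true).argmax(1)                    # separable toy labels
Y = np.eye(k)[y]                              # one-hot targets

def softmax(z):
    z = z - z.max(1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(1, keepdims=True)

def grads(W, m):
    """Gradients of mean cross-entropy for logits = (X * m) @ W."""
    p = softmax((X * m) @ W)
    g_logits = (p - Y) / n
    g_W = (X * m).T @ g_logits                # gradient w.r.t. weights
    g_m = ((g_logits @ W.T) * X).sum(0)       # gradient w.r.t. mask
    return g_W, g_m

def accuracy(W, m):
    return (((X * m) @ W).argmax(1) == y).mean()

# Stage 0: train a model (stand-in for the backdoored model)
W = rng.normal(size=(d, k)) * 0.01
m = np.ones(d)                                # mask initialized to all ones
for _ in range(300):
    g_W, _ = grads(W, m)
    W -= 0.5 * g_W

# Stage 1: neuron-level unlearning -- gradient *ascent* on the clean
# loss until clean accuracy drops to chance level (1/k)
for _ in range(500):
    if accuracy(W, m) <= 1.0 / k:
        break
    g_W, _ = grads(W, m)
    W += 0.5 * g_W

# Stage 2: filter-level recovering -- gradient descent on the mask only,
# weights frozen, with 0-1 clipping after each step
for _ in range(200):
    _, g_m = grads(W, m)
    m = np.clip(m - 0.2 * g_m, 0.0, 1.0)

# Stage 3: pruning -- binarize the mask, I(m > DT)
DT = 0.5
m_pruned = (m > DT).astype(float)
```

Neurons whose mask value falls below DT after recovering are the ones the defense prunes; in the paper these are the filters the recovery step exposes as backdoor-related.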
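The training schedule quoted in the Experiment Setup row (initial learning rate 0.1, divided by 10 at the 60th and 120th epochs) is a standard step decay; a hypothetical helper, not taken from the released code, makes the schedule concrete:

```python
def step_lr(epoch, base_lr=0.1, milestones=(60, 120), gamma=0.1):
    """Learning rate at a given epoch under step decay: multiply
    base_lr by gamma at each milestone epoch that has been reached."""
    lr = base_lr
    for milestone in milestones:
        if epoch >= milestone:
            lr *= gamma
    return lr
```

So epochs 0-59 train at 0.1, epochs 60-119 at 0.01, and epochs 120-199 at 0.001, matching the quoted 200-epoch schedule.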