Adversarial Neuron Pruning Purifies Backdoored Deep Models

Authors: Dongxian Wu, Yisen Wang

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show, even with only an extremely small amount of clean data (e.g., 1%), ANP effectively removes the injected backdoor without causing obvious performance degradation. Our code is available at https://github.com/csdongxian/ANP_backdoor. Extensive experiments demonstrate that ANP consistently provides state-of-the-art defense performance against various backdoor attacks, even using an extremely small amount of clean data. |
| Researcher Affiliation | Academia | 1 Dept. of Computer Science and Technology, Tsinghua University, China; 2 Key Lab. of Machine Perception, School of Artificial Intelligence, Peking University, China; 3 Institute for Artificial Intelligence, Peking University, China |
| Pseudocode | Yes | Algorithm 1: Adversarial Neuron Pruning (ANP) |
| Open Source Code | Yes | Our code is available at https://github.com/csdongxian/ANP_backdoor. |
| Open Datasets | Yes | We evaluate the performance of all attacks and defenses on CIFAR-10 [19] using ResNet-18 [15] as the base model. |
| Dataset Splits | Yes | We use 90% of the training data to train the backdoored DNNs and use all or part of the remaining 10% for defense. All defense methods are assumed to have access to the same 1% of clean training data (500 images). (A data-split sketch follows the table.) |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For ANP, we optimize all masks using Stochastic Gradient Descent (SGD) with the perturbation budget ϵ = 0.4 and the trade-off coefficient α = 0.2. We set the batch size to 128, a constant learning rate of 0.2, and momentum of 0.9 for 2000 iterations in total. Typical data augmentation such as random crop and horizontal flipping is applied. After optimization, neurons with a mask value smaller than 0.2 are pruned. (A configuration sketch follows the table.) |
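
To make the Dataset Splits row concrete, here is a minimal sketch (not the authors' released code) of a CIFAR-10 split along those lines; the seed, the `data_root` location, and picking the 500 defense images at random are assumptions.

```python
# Minimal sketch of the split described in the table (assumptions: seed, data_root,
# and that the 500-image defense split is drawn at random from the held-out 10%).
import numpy as np
from torch.utils.data import Subset
from torchvision import datasets, transforms

data_root = "./data"            # assumed download location
rng = np.random.default_rng(0)  # assumed seed

full_train = datasets.CIFAR10(data_root, train=True, download=True,
                              transform=transforms.ToTensor())        # 50,000 images

indices = rng.permutation(len(full_train)).tolist()
n_backdoor = int(0.9 * len(full_train))                                # 45,000 images

backdoor_train_set = Subset(full_train, indices[:n_backdoor])  # trains the backdoored DNN
held_out = indices[n_backdoor:]                                # remaining 10% (5,000 images)
defense_set = Subset(full_train, held_out[:500])               # 1% clean data (500 images) for defense

print(len(backdoor_train_set), len(defense_set))               # 45000 500
```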
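
The Experiment Setup row can likewise be read as a configuration. The sketch below shows one way such an ANP-style mask optimization could be wired up in PyTorch; it is not the authors' implementation (see their repository linked above). Only the numbers (ϵ = 0.4, α = 0.2, batch size 128, learning rate 0.2, momentum 0.9, 2000 iterations, pruning threshold 0.2) come from the row; the toy network, the random stand-in data, the one-step sign-ascent inner loop, and the way masks and perturbations are attached to BatchNorm channels are simplifying assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Values from the Experiment Setup row; everything else in this sketch is an assumption.
EPS, ALPHA, LR, MOMENTUM, ITERS, PRUNE_THRESH = 0.4, 0.2, 0.2, 0.9, 2000, 0.2


class MaskedBatchNorm2d(nn.BatchNorm2d):
    """BatchNorm whose affine transform carries a learnable per-channel mask and an
    adversarial multiplicative perturbation (a simplification of ANP's neuron masks)."""

    def __init__(self, num_features):
        super().__init__(num_features)
        self.neuron_mask = nn.Parameter(torch.ones(num_features))    # optimized, kept in [0, 1]
        self.neuron_noise = nn.Parameter(torch.zeros(num_features))  # adversarial, kept in [-EPS, EPS]

    def forward(self, x):
        scale = self.neuron_mask * (1.0 + self.neuron_noise)
        return F.batch_norm(x, self.running_mean, self.running_var,
                            weight=self.weight * scale, bias=self.bias * scale,
                            training=self.training, momentum=self.momentum, eps=self.eps)


class ToyNet(nn.Module):
    """Tiny stand-in for the paper's ResNet-18, only to keep the sketch self-contained."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.bn = MaskedBatchNorm2d(16)
        self.fc = nn.Linear(16, num_classes)

    def forward(self, x):
        x = F.relu(self.bn(self.conv(x)))
        return self.fc(F.adaptive_avg_pool2d(x, 1).flatten(1))


def anp_purify(model, clean_loader, device="cpu", iters=ITERS):
    model.to(device).eval()  # freeze BN running statistics while optimizing masks (a simplification)
    masks = [p for n, p in model.named_parameters() if n.endswith("neuron_mask")]
    noises = [p for n, p in model.named_parameters() if n.endswith("neuron_noise")]
    mask_opt = torch.optim.SGD(masks, lr=LR, momentum=MOMENTUM)  # SGD, lr 0.2, momentum 0.9

    step, data_iter = 0, iter(clean_loader)
    while step < iters:
        try:
            x, y = next(data_iter)
        except StopIteration:
            data_iter = iter(clean_loader)
            x, y = next(data_iter)
        x, y = x.to(device), y.to(device)

        # Inner maximization: one sign-gradient ascent step on the perturbations,
        # projected back into the budget [-EPS, EPS] (the step size 0.2 is an assumption).
        model.zero_grad(set_to_none=True)
        F.cross_entropy(model(x), y).backward()
        with torch.no_grad():
            for nz in noises:
                nz.add_(0.2 * nz.grad.sign()).clamp_(-EPS, EPS)

        # Outer minimization over the masks: alpha * natural loss + (1 - alpha) * perturbed loss.
        mask_opt.zero_grad()
        ((1 - ALPHA) * F.cross_entropy(model(x), y)).backward()  # loss under perturbed neurons
        saved = [nz.detach().clone() for nz in noises]
        with torch.no_grad():
            for nz in noises:
                nz.zero_()                                       # switch the perturbation off
        (ALPHA * F.cross_entropy(model(x), y)).backward()        # natural (unperturbed) loss
        mask_opt.step()
        with torch.no_grad():
            for m in masks:
                m.clamp_(0.0, 1.0)
            for nz, s in zip(noises, saved):
                nz.copy_(s)                                      # restore the perturbation state
        step += 1

    # Pruning: permanently zero out neurons whose mask fell below the threshold.
    with torch.no_grad():
        for nz in noises:
            nz.zero_()
        for m in masks:
            m[m < PRUNE_THRESH] = 0.0
    return model


if __name__ == "__main__":
    torch.manual_seed(0)
    # Random stand-in for the 500-image clean defense split (batch size 128 as in the paper).
    data = TensorDataset(torch.randn(500, 3, 32, 32), torch.randint(0, 10, (500,)))
    loader = DataLoader(data, batch_size=128, shuffle=True)
    purified = anp_purify(ToyNet(), loader, iters=50)  # use iters=ITERS (2000) for the full schedule
    print(purified.bn.neuron_mask)
```

The outer step trades off the natural loss against the loss under adversarially perturbed neurons (weights α and 1 − α), so masks of neurons that are most sensitive to the perturbation are driven toward zero and then pruned at the 0.2 threshold.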