Adversarial Neuron Pruning Purifies Backdoored Deep Models
Authors: Dongxian Wu, Yisen Wang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that, even with only an extremely small amount of clean data (e.g., 1%), ANP effectively removes the injected backdoor without causing obvious performance degradation. Extensive experiments demonstrate that ANP consistently provides state-of-the-art defense performance against various backdoor attacks, even using an extremely small amount of clean data. |
| Researcher Affiliation | Academia | 1 Dept. of Computer Science and Technology, Tsinghua University, China; 2 Key Lab. of Machine Perception, School of Artificial Intelligence, Peking University, China; 3 Institute for Artificial Intelligence, Peking University, China |
| Pseudocode | Yes | Algorithm 1 Adversarial Neuron Pruning (ANP); a hedged sketch of this procedure appears after the table. |
| Open Source Code | Yes | Our code is available at https://github.com/csdongxian/ANP_backdoor. |
| Open Datasets | Yes | We evaluate the performance of all attacks and defenses on CIFAR-10 [19] using ResNet-18 [15] as the base model. |
| Dataset Splits | Yes | We use 90% of the training data to train the backdoored DNNs and use all or part of the remaining 10% for defense. All defense methods are assumed to have access to the same 1% of clean training data (500 images). |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For ANP, we optimize all masks using Stochastic Gradient Descent (SGD) with the perturbation budget ϵ = 0.4 and the trade-off coefficient α = 0.2. We set the batch size to 128, a constant learning rate of 0.2, and momentum 0.9 for 2000 iterations in total. Typical data augmentations such as random crop and horizontal flipping are applied. After optimization, neurons with mask values smaller than 0.2 are pruned. A sketch of this loop with the quoted settings follows the table. |
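
The procedure named in the Pseudocode row, Algorithm 1, alternates between adversarially perturbing neurons and learning pruning masks on a small clean set. Below is a minimal PyTorch sketch under several assumptions: per-output-channel masks and multiplicative perturbations on each convolution, a single signed-gradient step for the inner maximization, and a trade-off of α times the clean loss plus (1 − α) times the perturbed loss. The names `MaskedConv2d` and `anp_step` are illustrative, not from the authors' repository, and the exact objective may differ in detail from the paper's.

```python
# Hypothetical sketch of ANP-style mask learning; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConv2d(nn.Conv2d):
    """Conv2d whose output channels are scaled by a learnable mask m in [0, 1]
    and an adversarial multiplicative perturbation delta in [-eps, eps]."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.mask = nn.Parameter(torch.ones(self.out_channels))    # m: optimized
        self.delta = nn.Parameter(torch.zeros(self.out_channels))  # adversarial

    def forward(self, x):
        scale = (self.mask * (1.0 + self.delta)).view(1, -1, 1, 1)
        return super().forward(x) * scale

def anp_step(model, x, y, mask_opt, eps=0.4, alpha=0.2):
    """One outer iteration: (1) a signed ascent step on the perturbations,
    projected onto [-eps, eps]; (2) an SGD step on the masks against
    alpha * clean loss + (1 - alpha) * perturbed loss; (3) clamp masks."""
    layers = [m for m in model.modules() if isinstance(m, MaskedConv2d)]

    # Inner maximization, simplified here to one step starting from delta = 0.
    loss_adv = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss_adv, [l.delta for l in layers])
    with torch.no_grad():
        for l, g in zip(layers, grads):
            l.delta.add_(eps * g.sign()).clamp_(-eps, eps)

    loss_pert = F.cross_entropy(model(x), y)  # loss under perturbed neurons
    with torch.no_grad():
        for l in layers:
            l.delta.zero_()                   # reset for the clean forward pass
    loss_nat = F.cross_entropy(model(x), y)   # clean (unperturbed) loss

    mask_opt.zero_grad()
    (alpha * loss_nat + (1 - alpha) * loss_pert).backward()
    mask_opt.step()
    with torch.no_grad():
        for l in layers:
            l.mask.clamp_(0.0, 1.0)           # keep masks in [0, 1]
```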
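
Wiring in the hyperparameters quoted in the Experiment Setup row gives a driver loop like the one below, continuing from the sketch above. Here `model` and `clean_loader` (a DataLoader over the roughly 500 clean images, with random crop and horizontal flipping) are assumptions; only the SGD settings, the 2000-iteration budget, and the 0.2 pruning threshold come from the paper.

```python
# Hypothetical driver using the quoted settings: SGD(lr=0.2, momentum=0.9),
# batch size 128, 2000 iterations, then prune channels whose mask is < 0.2.
mask_params = [m.mask for m in model.modules() if isinstance(m, MaskedConv2d)]
mask_opt = torch.optim.SGD(mask_params, lr=0.2, momentum=0.9)

step = 0
while step < 2000:                      # "2000 iterations in total"
    for x, y in clean_loader:           # batch size 128, augmented clean data
        anp_step(model, x, y, mask_opt, eps=0.4, alpha=0.2)
        step += 1
        if step == 2000:
            break

with torch.no_grad():
    for layer in model.modules():
        if isinstance(layer, MaskedConv2d):
            layer.mask[layer.mask < 0.2] = 0.0  # prune suspicious neurons
```

Pruning is implemented here by zeroing the masks rather than physically removing channels, which keeps the sketch architecture-agnostic.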