Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks
Authors: Yige Li, Xixiang Lyu, Nodens Koren, Lingjuan Lyu, Bo Li, Xingjun Ma
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show, against 6 state-of-the-art backdoor attacks, that NAD can effectively erase the backdoor triggers using only 5% clean training data without causing obvious performance degradation on clean examples. |
| Researcher Affiliation | Collaboration | Yige Li¹, Xixiang Lyu¹, Nodens Koren², Lingjuan Lyu³, Bo Li⁴, Xingjun Ma⁵; ¹Xidian University, ²The University of Melbourne, ³Ant Group, ⁴University of Illinois at Urbana-Champaign, ⁵Deakin University, Geelong |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Our code is available at https://github.com/bboylyg/NAD. |
| Open Datasets | Yes | We test the performance of all attacks and erasing methods on two benchmark datasets, CIFAR-10 and GTSRB, with WideResNet (WRN-16-1) being the base model throughout the experiments. |
| Dataset Splits | Yes | We assume all defense methods have access to the same 5% of the clean training data, and stop the finetuning when there are no significant improvements on the validation accuracy within a few epochs (e.g. at epoch 5). |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for running the experiments, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions software like PyTorch but does not provide specific version numbers for any software dependencies required for reproducibility. |
| Experiment Setup | Yes | For NAD, we finetune the backdoored model (i.e. the student network) on the 5% accessible clean data for 10 epochs (results for 20 epochs can be found in Appendix J) using the Stochastic Gradient Descent (SGD) optimizer with a momentum of 0.9, an initial learning rate of 0.1, and a weight decay factor of 10⁻⁴. The learning rate is divided by 10 after every 2 epochs. We use a batch size of 64, and apply typical data augmentation techniques including random crop (padding = 4), horizontal flipping, and Cutout (n_holes=1 and length=9) (DeVries & Taylor, 2017). |
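
The hyperparameters reported in the Experiment Setup row can be collected into a minimal sketch. This is not the authors' code (their implementation is in the linked GitHub repository); the constant and function names below are illustrative, and only the step learning-rate schedule (divide by 10 after every 2 epochs, starting from 0.1) is computed:

```python
# Finetuning hyperparameters as reported in the paper's experiment setup.
# Names here (BASE_LR, lr_at_epoch, ...) are illustrative, not from the paper.
BASE_LR = 0.1        # initial learning rate for SGD
LR_DECAY = 0.1       # learning rate is divided by 10 ...
DECAY_EVERY = 2      # ... after every 2 epochs
EPOCHS = 10          # finetuning epochs (20-epoch results in Appendix J)
BATCH_SIZE = 64
MOMENTUM = 0.9
WEIGHT_DECAY = 1e-4  # weight decay factor of 10^-4

def lr_at_epoch(epoch: int) -> float:
    """Step schedule: the learning rate used during the given epoch."""
    return BASE_LR * LR_DECAY ** (epoch // DECAY_EVERY)

# Per-epoch learning rates over the 10-epoch finetuning run:
schedule = [lr_at_epoch(e) for e in range(EPOCHS)]
```

In a PyTorch implementation this schedule would correspond to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)` on top of `torch.optim.SGD(..., lr=0.1, momentum=0.9, weight_decay=1e-4)`; the random-crop and horizontal-flip augmentations map to standard `torchvision.transforms`, while Cutout requires a custom transform.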