Defending Neural Backdoors via Generative Distribution Modeling
Authors: Ximing Qiao, Yukun Yang, Hai Li
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on Cifar10/100 dataset demonstrate the effectiveness of MESA in modeling the trigger distribution and the robustness of the proposed defense method. |
| Researcher Affiliation | Academia | Ximing Qiao* ECE Department Duke University Durham, NC 27708 ximing.qiao@duke.edu, Yukun Yang* ECE Department Duke University Durham, NC 27708 yukun.yang@duke.edu, Hai Li ECE Department Duke University Durham, NC 27708 hai.li@duke.edu |
| Pseudocode | Yes | Algorithm 1: Max-entropy staircase approximator (MESA), Algorithm 2: MESA implementation |
| Open Source Code | Yes | Source code of the experiments are available on https://github.com/superrrpotato/Defending-Neural-Backdoors-via-Generative-Distribution-Modeling. |
| Open Datasets | Yes | The experiments are performed on Cifar10 and Cifar100 dataset [9] |
| Dataset Splits | Yes | For the 10K testing images from Cifar10, we randomly take 8K for trigger distribution modeling and model retraining, and use the remaining 2K images for the defense evaluation. |
| Hardware Specification | No | The paper does not specify the exact hardware used for experiments (e.g., GPU models, CPU types, or memory specifications), only general terms are implied. |
| Software Dependencies | No | The paper mentions various software components and models like ResNet-18, GANs, VAEs, MINE, SGD, PCA, and t-SNE but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | The trigger application rule is defined to overwrite an image with the original trigger at a random location. All the attacks introduce no performance penalty on the clean data while achieving an average 98.7% ASR on the 51 original triggers. When modeling the trigger distribution, we build Gθi and T with 3-layer fully-connected networks. Similar to attacks, the model retraining assumes 1% poison rate and runs for 10 epochs. With α = 0.1 and βi = 0.5, 0.8, 0.9, and an ensembled model (The effect of model ensembling is discussed in Appendix B), our defense reliably reduces the ASR of original trigger from above 92% to below 9.1% for all 51 original triggers regardless of choice of βi. |
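The trigger application rule and 1% poison rate quoted above can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' released code: the function names (`apply_trigger`, `poison_dataset`), the NumPy representation, and the `target_label` parameter are assumptions; only the rule itself (overwrite the image with the trigger at a random location) and the 1% poison rate come from the paper.

```python
import numpy as np

def apply_trigger(image, trigger, rng):
    """Overwrite a random patch of `image` with `trigger`.

    Follows the paper's trigger application rule: the trigger is
    pasted over the image at a uniformly random location.
    `image` is (H, W, C); `trigger` is (h, w, C) with h <= H, w <= W.
    """
    h_img, w_img, _ = image.shape
    h_trg, w_trg, _ = trigger.shape
    top = rng.integers(0, h_img - h_trg + 1)
    left = rng.integers(0, w_img - w_trg + 1)
    poisoned = image.copy()
    poisoned[top:top + h_trg, left:left + w_trg] = trigger
    return poisoned

def poison_dataset(images, labels, trigger, target_label,
                   poison_rate=0.01, seed=0):
    """Poison a `poison_rate` fraction of the dataset (1% in the
    paper's retraining setup), relabeling poisoned samples to the
    attacker's `target_label` (a hypothetical parameter here)."""
    rng = np.random.default_rng(seed)
    n = len(images)
    n_poison = max(1, int(n * poison_rate))
    idx = rng.choice(n, size=n_poison, replace=False)
    images, labels = images.copy(), labels.copy()
    for i in idx:
        images[i] = apply_trigger(images[i], trigger, rng)
        labels[i] = target_label
    return images, labels
```

For Cifar10-sized inputs this would operate on `(32, 32, 3)` arrays; the same sketch applies to both the attack (poisoned training) and the retraining-based defense described in the setup cell.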