Defending Neural Backdoors via Generative Distribution Modeling

Authors: Ximing Qiao, Yukun Yang, Hai Li

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on Cifar10/100 dataset demonstrate the effectiveness of MESA in modeling the trigger distribution and the robustness of the proposed defense method.
Researcher Affiliation | Academia | Ximing Qiao* (ECE Department, Duke University, Durham, NC 27708, ximing.qiao@duke.edu); Yukun Yang* (ECE Department, Duke University, Durham, NC 27708, yukun.yang@duke.edu); Hai Li (ECE Department, Duke University, Durham, NC 27708, hai.li@duke.edu)
Pseudocode | Yes | Algorithm 1: Max-entropy staircase approximator (MESA); Algorithm 2: MESA implementation
Open Source Code | Yes | Source code of the experiments is available at https://github.com/superrrpotato/Defending-Neural-Backdoors-via-Generative-Distribution-Modeling
Open Datasets | Yes | The experiments are performed on the Cifar10 and Cifar100 datasets [9]
Dataset Splits | Yes | For the 10K testing images from Cifar10, we randomly take 8K for trigger distribution modeling and model retraining, and use the remaining 2K images for the defense evaluation.
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU model, CPU type, or memory size).
Software Dependencies | No | The paper references software components and techniques such as ResNet-18, GANs, VAEs, MINE, SGD, PCA, and t-SNE, but does not provide version numbers for any of them.
Experiment Setup | Yes | The trigger application rule is defined to overwrite an image with the original trigger at a random location. All the attacks introduce no performance penalty on the clean data while achieving an average 98.7% ASR on the 51 original triggers. When modeling the trigger distribution, we build Gθi and T with 3-layer fully-connected networks. As with the attacks, the model retraining assumes a 1% poison rate and runs for 10 epochs. With α = 0.1, βi = 0.5, 0.8, 0.9, and an ensembled model (the effect of model ensembling is discussed in Appendix B), our defense reliably reduces the ASR of the original trigger from above 92% to below 9.1% for all 51 original triggers, regardless of the choice of βi.
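
The Dataset Splits row above describes an 8K/2K partition of the Cifar10 test set. The snippet below is a minimal sketch of such a split, assuming PyTorch/torchvision and a fixed seed; the variable names and loading path are illustrative and not taken from the authors' repository.

```python
# Hypothetical sketch of the 8K/2K split of the Cifar10 test set described in
# the Dataset Splits row. Paths, names, and the seed are assumptions.
import torch
from torch.utils.data import random_split
import torchvision
import torchvision.transforms as transforms

test_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=transforms.ToTensor()
)  # the 10K Cifar10 test images

split_gen = torch.Generator().manual_seed(0)  # fixed seed so the split is reproducible
modeling_set, eval_set = random_split(test_set, [8000, 2000], generator=split_gen)
# modeling_set (8K): trigger distribution modeling and model retraining
# eval_set (2K): defense evaluation only
```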
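The Experiment Setup row defines the trigger application rule as overwriting an image with the trigger at a random location. Below is a minimal sketch of that rule; the function and argument names are hypothetical, and the trigger size is left to the caller.

```python
# Sketch of the stated trigger application rule: overwrite the image with the
# trigger patch at a uniformly random location. Names are illustrative only.
import torch

def apply_trigger(image: torch.Tensor, trigger: torch.Tensor) -> torch.Tensor:
    """Overwrite `image` (C, H, W) with `trigger` (C, h, w) at a random location."""
    _, H, W = image.shape
    _, h, w = trigger.shape
    top = torch.randint(0, H - h + 1, (1,)).item()   # random row offset
    left = torch.randint(0, W - w + 1, (1,)).item()  # random column offset
    patched = image.clone()
    patched[:, top:top + h, left:left + w] = trigger  # overwrite pixels, no blending
    return patched
```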
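The same row states that Gθi and T are built as 3-layer fully-connected networks. The sketch below shows one plausible shape for such networks, assuming a 3x3 RGB trigger, a 16-dimensional latent code, and 128-unit hidden layers; none of these sizes, nor the MESA training objective, is given in this table, and T's role as a scalar-valued test function is inferred from the MINE reference in the Software Dependencies row.

```python
# Illustrative 3-layer fully-connected networks of the kind the Experiment Setup
# row mentions. Latent size, hidden width, and the 3x3 RGB trigger shape are
# assumptions; the MESA training procedure itself is not reproduced here.
import torch
import torch.nn as nn

class TriggerGenerator(nn.Module):
    """Maps a latent code z to a trigger patch (assumed 3x3 RGB here)."""
    def __init__(self, latent_dim: int = 16, hidden: int = 128, trigger_numel: int = 3 * 3 * 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, trigger_numel), nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z).view(-1, 3, 3, 3)

class TestFunction(nn.Module):
    """Scalar-valued network on trigger samples (role inferred from the MINE reference)."""
    def __init__(self, trigger_numel: int = 3 * 3 * 3, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(trigger_numel, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, trigger: torch.Tensor) -> torch.Tensor:
        return self.net(trigger.flatten(1))  # expects a batched (N, C, h, w) trigger tensor
```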