One-shot Neural Backdoor Erasing via Adversarial Weight Masking
Authors: Shuwen Chai, Jinghui Chen
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct thorough experiments to verify the effectiveness of our proposed AWM method and analyze the sensitivity on hyper-parameters via ablation studies. |
| Researcher Affiliation | Academia | Shuwen Chai Renmin University of China chaishuwen@ruc.edu.cn Jinghui Chen Pennsylvania State University jzc5917@psu.edu |
| Pseudocode | Yes | Algorithm 1 Adversarial Weight Masking (AWM). Input: infected DNN f with weights θ, clean dataset D = {(x_i, y_i)}_{i=1}^{n}, batch size b, learning rates η1, η2, hyper-parameters α, β, γ, epochs E, inner iteration loops T, L1 norm bound τ. 1: Initialize all elements of m to 1. 2: for i = 1 to E do 3: Initialize Δ as 0 // Phase 1: inner optimization 4: for t = 1 to T do 5: Sample a minibatch (x, y) from D with size b 6: L_inner = L(f(x + Δ; m ⊙ θ), y) 7: Δ = Δ + η1 ∇_Δ L_inner 8: end for 9: Clip Δ: Δ = min(1, τ/‖Δ‖₁) · Δ // Phase 2: outer optimization 10: for t = 1 to T do 11: L_outer = α L(f(x; m ⊙ θ), y) + β L(f(x + Δ; m ⊙ θ), y) + γ‖m‖₁ 12: m = m − η2 ∇_m L_outer 13: Clip m to [0, 1] 14: end for 15: end for Output: filter masks m for the weights of network f. |
| Open Source Code | Yes | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | Datasets and Networks. We conduct experiments on two datasets: CIFAR-10 [26] and GTSRB [20]. |
| Dataset Splits | No | CIFAR-10 contains 50000 training data and 10000 test data of 10 classes. While train/test counts are given, a separate *validation* split is not detailed for the main experiments, and the 'available data' for defense acts as a training/validation set for the defense itself. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for running its experiments (e.g., GPU model, CPU type, memory). |
| Software Dependencies | No | The paper mentions 'Pytorch [39]' but does not provide specific version numbers for software dependencies needed for reproducibility. |
| Experiment Setup | Yes | We test with α ∈ [0.5, 0.8], β = 1 − α, γ ∈ [10⁻⁸, 10⁻⁵], τ ∈ [10, 3000], and show the performance changes under the Trojan-SQ attack with 500 training data. When varying one hyper-parameter, we fix the others to the default values α0 = 0.9, γ0 = 10⁻⁷, τ0 = 1000. |
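The two-phase loop in Algorithm 1 can be sketched in NumPy on a toy linear model, f(x) = (m ⊙ w)ᵀx with squared loss, standing in for the masked DNN. This is a minimal illustration, not the paper's implementation: the real AWM masks convolutional filters of a deep network and samples fresh minibatches from D; all variable names and default values below are illustrative.

```python
import numpy as np

def awm_sketch(w, X, y, epochs=5, T=5, eta1=0.1, eta2=0.1,
               alpha=0.9, beta=0.1, gamma=1e-3, tau=1.0):
    """Toy AWM: inner ascent recovers a trigger-like perturbation delta;
    outer descent learns a soft weight mask m in [0, 1]."""
    m = np.ones_like(w)                          # init all mask entries to 1
    for _ in range(epochs):
        # Phase 1: inner optimization -- gradient ASCENT on delta
        delta = np.zeros(X.shape[1])
        for _ in range(T):
            pred = (X + delta) @ (m * w)
            # d/d(delta) of mean (pred - y)^2 for the linear model
            grad_delta = 2.0 * ((pred - y)[:, None] * (m * w)[None, :]).mean(0)
            delta += eta1 * grad_delta
        # project delta onto the L1 ball of radius tau
        delta *= min(1.0, tau / (np.abs(delta).sum() + 1e-12))
        # Phase 2: outer optimization -- gradient DESCENT on mask m
        for _ in range(T):
            pred_clean = X @ (m * w)
            pred_adv = (X + delta) @ (m * w)
            grad_m = (alpha * 2.0 * ((pred_clean - y)[:, None]
                                     * (X * w[None, :])).mean(0)
                      + beta * 2.0 * ((pred_adv - y)[:, None]
                                      * ((X + delta) * w[None, :])).mean(0)
                      + gamma * np.sign(m))      # subgradient of the L1 penalty
            m -= eta2 * grad_m
            m = np.clip(m, 0.0, 1.0)             # keep mask entries in [0, 1]
    return m
```

Filters whose mask entries are driven toward 0 are those the recovered perturbation exploits; pruning or down-weighting them is what erases the backdoor while the α-weighted clean-loss term preserves benign accuracy.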