One-shot Neural Backdoor Erasing via Adversarial Weight Masking

Authors: Shuwen Chai, Jinghui Chen

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we conduct thorough experiments to verify the effectiveness of our proposed AWM method and analyze its sensitivity to hyper-parameters via ablation studies."
Researcher Affiliation | Academia | Shuwen Chai, Renmin University of China (chaishuwen@ruc.edu.cn); Jinghui Chen, Pennsylvania State University (jzc5917@psu.edu)
Pseudocode | Yes | Algorithm 1: Adversarial Weight Masking (AWM). (A runnable PyTorch sketch of this procedure follows the table.)
Input: infected DNN f with weights θ; clean dataset D = {(x_i, y_i)}_{i=1}^n; batch size b; learning rates η_1, η_2; hyper-parameters α, β, γ; epochs E; inner iteration loops T; L1-norm bound τ.
1: Initialize all elements of m to 1
2: for i = 1 to E do
3:   Initialize δ to 0                      // Phase 1: Inner Optimization
4:   for t = 1 to T do
5:     Sample a minibatch (x, y) from D with size b
6:     L_inner = L(f(x + δ; m ⊙ θ), y)
7:     δ = δ + η_1 ∇_δ L_inner
8:   end for
9:   Clip δ: δ = min(1, τ / ‖δ‖_1) · δ      // Phase 2: Outer Optimization
10:  for t = 1 to T do
11:    L_outer = α L(f(x; m ⊙ θ), y) + β L(f(x + δ; m ⊙ θ), y) + γ ‖m‖_1
12:    m = m − η_2 ∇_m L_outer
13:    Clip m to [0, 1]
14:  end for
15: end for
Output: filter masks m for the weights of network f.
Open Source Code | Yes | NeurIPS checklist: "3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]"
Open Datasets | Yes | "Datasets and Networks. We conduct experiments on two datasets: CIFAR-10 [26] and GTSRB [20]."
Dataset Splits | No | "CIFAR-10 contains 50000 training data and 10000 test data of 10 classes." While train/test counts are given, a separate validation split is not detailed for the main experiments, and the 'available data' used by the defense effectively serves as the defense's own training/validation set. (A torchvision loading sketch follows the table.)
Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments (e.g., GPU model, CPU type, memory).
Software Dependencies | No | The paper mentions "Pytorch [39]" but does not provide version numbers for the software dependencies needed for reproducibility.
Experiment Setup | Yes | "We test with α ∈ [0.5, 0.8], β = 1 − α, γ ∈ [10^-8, 10^-5], τ ∈ [10, 3000], and show the performance changes under the Trojan-SQ attack with 500 training data. When varying the value of one specific hyper-parameter, we fix the others to the default values α_0 = 0.9, γ_0 = 10^-7, τ_0 = 1000." (A sweep-protocol sketch follows the table.)
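To make the two optimization phases of Algorithm 1 concrete, the following is a minimal PyTorch sketch of one AWM epoch on a toy masked network. It is an illustration under simplifying assumptions, not the authors' released implementation: the MaskedConvNet architecture, the per-filter mask granularity, the single fixed minibatch (Algorithm 1 samples a fresh minibatch per inner step), and the hyper-parameter defaults are hypothetical stand-ins.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConvNet(nn.Module):
    """Toy CNN whose conv filters are gated by per-filter masks in [0, 1].

    A hypothetical stand-in for the infected network f; AWM only needs
    some way of applying m * theta inside the forward pass.
    """
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.fc = nn.Linear(16, num_classes)
        self.mask = nn.Parameter(torch.ones(16))  # one mask per filter, init to 1

    def forward(self, x):
        w = self.mask.view(-1, 1, 1, 1) * self.conv.weight  # m applied to theta
        h = F.relu(F.conv2d(x, w, self.conv.bias, padding=1))
        return self.fc(F.adaptive_avg_pool2d(h, 1).flatten(1))

def awm_epoch(model, x, y, eta1=0.1, eta2=1e-3,
              alpha=0.9, beta=0.1, gamma=1e-7, tau=1000.0, T=5):
    # Phase 1: gradient *ascent* on delta recovers a universal perturbation
    # that maximizes the classification loss -- a proxy for the unknown trigger.
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(T):
        g, = torch.autograd.grad(F.cross_entropy(model(x + delta), y), delta)
        delta = (delta + eta1 * g).detach().requires_grad_(True)
    # Project delta onto the L1 ball of radius tau.
    delta = (delta * min(1.0, tau / (delta.abs().sum().item() + 1e-12))).detach()

    # Phase 2: gradient *descent* on the masks only; the infected weights
    # theta are never updated.
    for _ in range(T):
        loss_outer = (alpha * F.cross_entropy(model(x), y)
                      + beta * F.cross_entropy(model(x + delta), y)
                      + gamma * model.mask.abs().sum())
        g, = torch.autograd.grad(loss_outer, model.mask)
        with torch.no_grad():
            model.mask -= eta2 * g
            model.mask.clamp_(0.0, 1.0)  # keep masks inside [0, 1]

# Usage on random CIFAR-10-shaped data:
model = MaskedConvNet()
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
awm_epoch(model, x, y)
```

The point the sketch preserves is that the masks enter the forward pass multiplicatively, so outer-loss gradients flow to m while θ stays frozen; masks driven toward zero then prune the filters that the recovered perturbation exploits.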
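On the dataset side, both benchmarks ship with torchvision, so the standard train/test splits and a small clean subset for the defense can be materialized directly. A minimal sketch, assuming torchvision >= 0.12 (when the GTSRB loader was added); the 500-sample defense subset mirrors the "500 training data" setting quoted above:

```python
import torch
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# Standard train/test splits as shipped with torchvision.
cifar_train = datasets.CIFAR10("data", train=True, download=True, transform=to_tensor)
cifar_test = datasets.CIFAR10("data", train=False, download=True, transform=to_tensor)
gtsrb_train = datasets.GTSRB("data", split="train", download=True, transform=to_tensor)

# Hypothetical clean subset available to the defense; the paper does not
# specify how this subset is sampled, so a random draw is assumed here.
idx = torch.randperm(len(cifar_train))[:500]
defense_set = torch.utils.data.Subset(cifar_train, idx)
```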
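Read as a protocol, the quoted experiment setup is a one-at-a-time sensitivity sweep. A hypothetical sketch of that loop; only the range endpoints, the β = 1 − α tie, and the defaults come from the paper, while the intermediate grid points and the run_awm hook are illustrative:

```python
# Defaults quoted in the paper: alpha_0 = 0.9, gamma_0 = 1e-7, tau_0 = 1000.
defaults = {"alpha": 0.9, "gamma": 1e-7, "tau": 1000.0}

# Tested ranges; the intermediate grid points are assumptions.
sweeps = {
    "alpha": [0.5, 0.6, 0.7, 0.8],      # beta is tied: beta = 1 - alpha
    "gamma": [1e-8, 1e-7, 1e-6, 1e-5],
    "tau":   [10.0, 100.0, 1000.0, 3000.0],
}

# Vary one hyper-parameter at a time, keeping the others at their defaults.
for name, values in sweeps.items():
    for v in values:
        cfg = {**defaults, name: v}
        cfg["beta"] = 1.0 - cfg["alpha"]
        # run_awm(cfg)  # hypothetical hook: Trojan-SQ attack, 500 clean samples
        print(cfg)
```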