Shared Adversarial Unlearning: Backdoor Mitigation by Unlearning Shared Adversarial Examples

Authors: Shaokui Wei, Mingda Zhang, Hongyuan Zha, Baoyuan Wu

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on various benchmark datasets and network architectures show that our proposed method achieves state-of-the-art performance for backdoor defense.
Researcher Affiliation | Academia | (1) School of Data Science, The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), China; (2) Shenzhen Institute of Artificial Intelligence and Robotics for Society, China
Pseudocode | Yes | Algorithm 1: Shared Adversarial Unlearning
Open Source Code | Yes | The code is available at https://github.com/SCLBD/BackdoorBench (PyTorch) and https://github.com/shawkui/MindTrojan (MindSpore).
Open Datasets | Yes | We evaluate all the attacks on 3 benchmark datasets, CIFAR-10 [24], Tiny ImageNet [26], and GTSRB [43].
Dataset Splits | Yes | By default, all the defense methods can access 5% benign training data. We evaluate all the attacks on 3 benchmark datasets, CIFAR-10 [24], Tiny ImageNet [26], and GTSRB [43].
Hardware Specification | Yes | All experiments are conducted on a server with GPU RTX 4090 and CPU AMD EPYC 7543 32-Core Processor.
Software Dependencies | No | The paper mentions PyTorch and MindSpore as frameworks used for the provided code, but does not specify their version numbers or the version numbers of any other ancillary software components.
Experiment Setup | Yes | For our method, we choose to generate the shared adversarial example with Projected Gradient Descent (PGD) [33] with L∞ norm. For all experiments, we run PGD 5 steps with norm bound 0.2, and we set λ1 = λ2 = λ4 = 1 and λ3 = 0.01. ... We run SAU 100 epochs in CIFAR-10 and GTSRB. In Tiny ImageNet, we run SAU 20 epochs.
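
The Dataset Splits row above reports that, by default, defenses can access only 5% of the benign training data. The snippet below is a minimal PyTorch sketch of how such a subset could be drawn for CIFAR-10; the `sample_benign_subset` helper, the class-agnostic random sampling, and the fixed seed are illustrative assumptions, not the BackdoorBench implementation.

```python
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

def sample_benign_subset(dataset, ratio=0.05, seed=0):
    """Hypothetical helper: draw a random `ratio` fraction of a clean training set."""
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(len(dataset), generator=g)
    keep = perm[: int(ratio * len(dataset))].tolist()
    return Subset(dataset, keep)

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
benign_5pct = sample_benign_subset(train_set, ratio=0.05)
print(len(benign_5pct))  # 2,500 of CIFAR-10's 50,000 training images
```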
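
The Experiment Setup row quotes PGD with 5 steps and norm bound 0.2 under the L∞ norm. Below is a minimal, self-contained L∞ PGD sketch using those two hyperparameters, assuming a plain cross-entropy objective and images scaled to [0, 1]. It only illustrates the inner maximization loop; SAU's actual objective for *shared* adversarial examples combines several loss terms weighted by λ1–λ4 and involves both the backdoored and the purified model, which is not reproduced here. The `pgd_linf` name and the step-size heuristic are assumptions.

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=0.2, steps=5, alpha=None):
    """L-infinity PGD sketch: 5 steps, norm bound 0.2 (per the reported setup)."""
    alpha = alpha if alpha is not None else 2.5 * eps / steps  # common step-size heuristic (assumption)
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()          # gradient ascent on the surrogate loss
            delta.clamp_(-eps, eps)                     # project back into the L-infinity ball
            delta.copy_((x + delta).clamp(0, 1) - x)    # keep perturbed images in the valid range
        delta.grad.zero_()
    return (x + delta).detach()
```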