SampDetox: Black-box Backdoor Defense via Perturbation-based Sample Detoxification

Authors: Yanxin Yang, Chentao Jia, Dengke Yan, Ming Hu, Tianlin Li, Xiaofei Xie, Xian Wei, Mingsong Chen

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments demonstrate the effectiveness of SampDetox in defending against various state-of-the-art backdoor attacks.
Researcher Affiliation | Academia | Yanxin Yang1, Chentao Jia1, Dengke Yan1, Ming Hu2, Tianlin Li3, Xiaofei Xie2, Xian Wei1, Mingsong Chen1. 1MoE Eng. Research Center of SW/HW Co-Design Tech. and App., East China Normal University; 2Singapore Management University; 3Nanyang Technological University.
Pseudocode | Yes | Algorithm 1 details the implementation of SampDetox.
Open Source Code | Yes | The source code of this work is publicly available at https://github.com/easywood0204/SampDetox.
Open Datasets | Yes | We investigated three classical datasets (i.e., CIFAR-10 [42], GTSRB [43], and Tiny-ImageNet [44]).
Dataset Splits | No | For the MS-Celeb-1M dataset, the training and test sets are split with a ratio of 8:2; the other datasets implicitly use standard splits, but no explicit train/validation/test percentages are given (see the split sketch after this table).
Hardware Specification | Yes | All experiments were carried out on an Ubuntu workstation equipped with one Intel i7-13700K CPU, 64GB of memory, and one NVIDIA GeForce RTX 4090 GPU.
Software Dependencies | Yes | We implemented our approach, i.e., SampDetox, on top of PyTorch (version 1.13.0).
Experiment Setup | Yes | The number of total diffusion steps was set to 1000 for all datasets, the noise schedule was set to cosine, and the learning rate was set to 1e-4. The authors suggest setting t1 = 20 and t2 = 120 in practice (see the configuration sketch after this table).
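The 8:2 split reported for MS-Celeb-1M can be reproduced with a standard PyTorch split. The sketch below is illustrative only: the stand-in dataset object and the fixed seed are assumptions, not taken from the SampDetox code.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical stand-in for the MS-Celeb-1M dataset object.
full_dataset = TensorDataset(
    torch.rand(100, 3, 32, 32),
    torch.zeros(100, dtype=torch.long),
)

# 8:2 train/test split, as reported in the paper.
n_train = int(0.8 * len(full_dataset))
train_set, test_set = random_split(
    full_dataset,
    [n_train, len(full_dataset) - n_train],
    generator=torch.Generator().manual_seed(0),  # fixed seed (assumption) for repeatability
)
```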
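The experiment-setup row pins down the diffusion configuration: T = 1000 total steps, a cosine noise schedule, a learning rate of 1e-4, and suggested detoxification steps t1 = 20 and t2 = 120. A minimal sketch of that configuration follows, assuming the standard cosine schedule of Nichol and Dhariwal (2021); the helper names and forward-noising function are illustrative, not the authors' API.

```python
import math
import torch

T = 1000              # total diffusion steps (from the paper)
T1, T2 = 20, 120      # weak/strong detoxification steps suggested by the authors
LEARNING_RATE = 1e-4  # reported diffusion-model learning rate

def cosine_alpha_bar(t: int, total: int = T, s: float = 0.008) -> float:
    """Cumulative signal rate abar_t under the cosine noise schedule
    (Nichol & Dhariwal, 2021); s is the standard small offset."""
    f = lambda u: math.cos((u / total + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)

def noise_to_step(x: torch.Tensor, t: int) -> torch.Tensor:
    """Forward-diffuse a batch x to step t:
    x_t = sqrt(abar_t) * x + sqrt(1 - abar_t) * eps, with eps ~ N(0, I)."""
    abar = cosine_alpha_bar(t)
    eps = torch.randn_like(x)
    return math.sqrt(abar) * x + math.sqrt(1.0 - abar) * eps

# A (possibly poisoned) CIFAR-10-sized batch, lightly perturbed to step T1;
# a pretrained diffusion model would then denoise x_t1 back toward a clean sample.
x = torch.rand(8, 3, 32, 32)
x_t1 = noise_to_step(x, T1)
```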