Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets
Authors: Ruisi Cai, Zhenyu Zhang, Tianlong Chen, Xiaohan Chen, Zhangyang Wang
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted with three datasets (CIFAR-10, GTSRB, Tiny ImageNet), three architectures (AlexNet, ResNet-20, SENet-18), and three attacks (BadNets [1], clean label attack [2], and WaNet [3]). Results consistently endorse the effectiveness of our proposed technique in backdoor model detection, with margins of 0.291–0.640 AUROC over the current state-of-the-arts. |
| Researcher Affiliation | Academia | University of Texas at Austin {ruisi.cai,zhenyu.zhang,tianlong.chen,xiaohan.chen,atlaswang}@utexas.edu |
| Pseudocode | No | The paper describes its method and procedures in natural language and figures, but it does not include a formal pseudocode block or algorithm. |
| Open Source Code | Yes | Codes are available at https://github.com/VITA-Group/Random-Shuffling-BackdoorDetect. |
| Open Datasets | Yes | Extensive experiments are conducted with three datasets (CIFAR-10, GTSRB, Tiny ImageNet), three architectures (AlexNet, ResNet-20, SENet-18), and three attacks (BadNets [1], clean label attack [2], and WaNet [3]). |
| Dataset Splits | No | The paper states that for detection, they split the training dataset into K subsets according to labels. However, it does not provide explicit overall training, validation, and test splits (e.g., percentages or counts) for the datasets used to train the models being analyzed. |
| Hardware Specification | Yes | We perform experiments on 8 2080Ti GPUs. |
| Software Dependencies | No | The paper mentions optimizers like SGD, but it does not specify software dependencies with version numbers (e.g., Python version, specific library versions like PyTorch or TensorFlow). |
| Experiment Setup | Yes | Table 1: Detailed training configurations of the backdoor injection procedure. Detection. To examine the reliability of the training dataset with N classes, we first divide it into N subsets based on their labels. Each subset is then fed into the target model as well as its randomly shuffled variant, and we compute the associated representation shifts over different numbers of shuffled layers. Based on our observations in Section 3.1, the last few layers mainly encode discriminative features and are therefore the ones used in our detection. In our implementation, we only shuffle the channel order within the last four layers and generate feature sensitivity curves as {y_k[n], k = 0, ..., N−1, n = 0, 1, 2, 3}. Table 4: Detailed configurations of trigger recovery methods. |
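
The detection procedure quoted in the Experiment Setup row lends itself to a compact implementation. Below is a minimal, hedged PyTorch sketch of the channel-shuffling sensitivity check. The helper names (`shuffle_last_n_conv_layers`, `representation_shift`, `sensitivity_curves`), the use of cosine distance over the model output as the "representation shift", and the reading of n = 0, ..., 3 as shuffling the last one to four convolutional layers are illustrative assumptions, not the authors' reference implementation (see the linked repository for that).

```python
# Hedged sketch of the channel-shuffling sensitivity check described above.
# Assumptions (not from the paper): cosine distance over the model output as
# the representation-shift measure, and n = 0..3 meaning "shuffle the last
# n+1 convolutional layers". The authors' repository is the reference.
import copy
import torch
import torch.nn as nn


def shuffle_last_n_conv_layers(model: nn.Module, n: int, seed: int = 0) -> nn.Module:
    """Copy `model` and randomly permute the output-channel order of its
    last `n` Conv2d layers (weights and biases), leaving the rest intact."""
    shuffled = copy.deepcopy(model)
    convs = [m for m in shuffled.modules() if isinstance(m, nn.Conv2d)]
    gen = torch.Generator().manual_seed(seed)
    for conv in convs[len(convs) - n:]:
        perm = torch.randperm(conv.out_channels, generator=gen)
        with torch.no_grad():
            conv.weight.copy_(conv.weight[perm])
            if conv.bias is not None:
                conv.bias.copy_(conv.bias[perm])
    return shuffled


@torch.no_grad()
def representation_shift(model, shuffled_model, loader, device="cpu"):
    """Mean cosine distance between the outputs of the original and the
    channel-shuffled model over one per-class data subset."""
    model.eval(); shuffled_model.eval()
    shifts = []
    for x, _ in loader:
        x = x.to(device)
        z_ref = model(x).flatten(1)
        z_shuf = shuffled_model(x).flatten(1)
        cos = nn.functional.cosine_similarity(z_ref, z_shuf, dim=1)
        shifts.append((1.0 - cos).mean().item())
    return sum(shifts) / len(shifts)


def sensitivity_curves(model, per_class_loaders, device="cpu"):
    """Feature sensitivity curves y_k[n] for k = 0..N-1 and n = 0..3: the
    representation shift of class-k data when the last n+1 conv layers of
    the model are shuffled (one reading of the paper's n = 0, 1, 2, 3)."""
    model = model.to(device)
    shuffled_variants = [
        shuffle_last_n_conv_layers(model, n + 1).to(device) for n in range(4)
    ]
    return {
        k: [representation_shift(model, sm, loader, device) for sm in shuffled_variants]
        for k, loader in enumerate(per_class_loaders)
    }
```

Here `per_class_loaders` would be built from the label-based subsets mentioned in the Dataset Splits row (one DataLoader per class, labels unused). Roughly speaking, a class whose sensitivity curve deviates markedly from the others is the signal used to flag a backdoored model; the exact outlier statistic and the AUROC evaluation protocol are described in the paper.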