Inspecting Prediction Confidence for Detecting Black-Box Backdoor Attacks

Authors: Tong Wang, Yuan Yao, Feng Xu, Miao Xu, Shengwei An, Ting Wang

AAAI 2024

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "We provide both theoretical and empirical evidence for the generality of this observation. DTINSPECTOR then carefully examines the prediction confidences of data samples, and decides the existence of backdoor using the shortcut nature of backdoor triggers. Extensive evaluations on six backdoor attacks, four datasets, and three advanced attacking types demonstrate the effectiveness of the proposed defense."

Researcher Affiliation | Academia | Tong Wang (1), Yuan Yao (1), Feng Xu (1), Miao Xu (2), Shengwei An (3), Ting Wang (4); (1) State Key Laboratory for Novel Software Technology, Nanjing University, China; (2) University of Queensland, Australia; (3) Purdue University, USA; (4) Stony Brook University, USA

Pseudocode | No | The paper describes the objective function for patch learning but does not provide pseudocode or clearly labeled algorithm blocks.

Open Source Code | No | The paper mentions using "open source code provided by their authors" for existing backdoor defenses, but it does not state that the code for its own proposed method (DTINSPECTOR) is open source, nor does it provide a link.

Open Datasets | Yes | "Datasets. We use four commonly-studied datasets in our experiments including CIFAR10 (Krizhevsky, Hinton et al. 2009), GTSRB (Stallkamp et al. 2011), ImageNet (Deng et al. 2009), and PubFig (Kumar et al. 2009). All the datasets are publicly available."

Dataset Splits | Yes | Table 1 ("Dataset Train/Test Label Classifier") provides explicit train/test split sizes for each dataset, e.g., CIFAR10: 50,000/10,000.

Hardware Specification | Yes | "The experiments are run on a machine with a 20-core Intel i9-10900KF CPU, 256 GB RAM, and one NVIDIA GeForce RTX 3090 GPU."

Software Dependencies | No | The paper mentions using default settings for existing defenses and discusses general software components, but it does not provide version numbers for any key software dependencies used in the experiments (e.g., Python, PyTorch, TensorFlow, CUDA).

Experiment Setup | Yes | "For λ, we initialize it to 0.0001 and dynamically adjust it to a proper value following (Wang et al. 2019). For the sampling size |Dh|, we empirically found that 50 high-confidence samples and 50 low-confidence samples are sufficient, and thus set it to 50 by default (i.e., |Dh| = |Dl| = 50)."