Few-shot Backdoor Attacks via Neural Tangent Kernels

Authors: Jonathan Hayase, Sewoong Oh

ICLR 2023

Reproducibility Variable Result LLM Response
Research Type: Experimental. We experiment on subclasses of CIFAR-10 and ImageNet with WideResNet-34 and ConvNeXt architectures on periodic and patch trigger attacks and show that NTBA-designed poisoned examples achieve, for example, an attack success rate of 90% with ten times fewer poison examples injected than the baseline.
Researcher Affiliation: Academia. Jonathan Hayase and Sewoong Oh, Paul G. Allen School of Computer Science and Engineering, University of Washington, {jhayase,sewoong}@cs.washington.edu
Pseudocode: Yes. Algorithm 1: Greedy subset selection. Input: data (X_data, y_data), number of poisons m ∈ ℕ. Output: m poison data points (X_p, y_p). ... Algorithm 2: Backdoor loss and gradient. Input: kernel matrix K_data,data, data (X_data, y_data) and (X_p, y_p). Output: backdoor design loss L and gradient ∂L/∂X_p.
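The quoted pseudocode for Algorithm 1 can be illustrated with a minimal, self-contained sketch of greedy subset selection against a kernel-regression surrogate. Everything below is an illustrative assumption: a toy RBF kernel stands in for the NTK, and `backdoor_loss` and `greedy_subset` are hypothetical names, not the paper's implementation (the paper additionally uses the gradient ∂L/∂X_p from Algorithm 2 to optimize the poison images themselves).

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Toy stand-in for the NTK: squared-exponential kernel matrix.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def backdoor_loss(X_tr, y_tr, X_bd, y_bd, reg=1e-3):
    # Squared error of kernel ridge regression on the triggered points:
    # how far the surrogate model's predictions are from the target labels.
    K = rbf_kernel(X_tr, X_tr)
    alpha = np.linalg.solve(K + reg * np.eye(len(X_tr)), y_tr)
    pred = rbf_kernel(X_bd, X_tr) @ alpha
    return ((pred - y_bd) ** 2).mean()

def greedy_subset(X_pool, y_pool, X_clean, y_clean, X_bd, y_bd, m):
    # Greedily add, m times, the candidate poison whose inclusion in the
    # training set most lowers the backdoor design loss.
    chosen = []
    for _ in range(m):
        best_i, best_l = None, np.inf
        for i in range(len(X_pool)):
            if i in chosen:
                continue
            idx = chosen + [i]
            X = np.vstack([X_clean, X_pool[idx]])
            y = np.concatenate([y_clean, y_pool[idx]])
            l = backdoor_loss(X, y, X_bd, y_bd)
            if l < best_l:
                best_i, best_l = i, l
        chosen.append(best_i)
    return chosen
```

The greedy loop mirrors the few-shot setting: with a small poison budget m, each candidate is scored by the loss of the kernel model retrained on clean data plus the candidate subset.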
Open Source Code: Yes. Our code is open sourced at https://github.com/SewoongLab/ntk-backdoor.
Open Datasets: Yes. Our network is trained with SGD on a 2-label subset of CIFAR-10 (Krizhevsky, 2009). ... We also attack a pretrained ConvNeXt (Liu et al., 2022) finetuned on a 2-label subset of ImageNet, following the setup of Saha et al. (2020), with details given in Appendix C.2.
Dataset Splits: Yes. To fairly evaluate performance, we split the CIFAR-10 training set into an inner training set and a validation set containing 80% and 20% of the images, respectively.
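The 80/20 split described above can be sketched in a few lines; the function name, seed, and use of NumPy are assumptions for illustration, not the authors' code.

```python
import numpy as np

def split_indices(n, val_frac=0.2, seed=0):
    # Shuffle indices, then carve off the first val_frac as validation
    # and keep the rest as the inner training set.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_val = int(n * val_frac)
    return perm[n_val:], perm[:n_val]

# CIFAR-10 has 50,000 training images; 80/20 gives 40,000 / 10,000.
train_idx, val_idx = split_indices(50000)
```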
Hardware Specification: No. The paper mentions distributing computation across "many GPUs" but does not provide specific models or detailed hardware specifications (e.g., GPU/CPU types, memory).
Software Dependencies: No. The paper mentions using JAX for autograd and the neural tangent kernel library by Novak et al. (2020), but it does not provide specific version numbers for these or other software dependencies.
Experiment Setup: Yes. We attack a WideResNet-34-5 (Zagoruyko & Komodakis, 2016) (d ≈ 10^7) with GELU activations (Hendrycks & Gimpel, 2016) so that our network satisfies the smoothness assumption in Section 2.4.2. Additionally, we do not use batch normalization, which is not yet supported by the neural tangent kernel library we use (Novak et al., 2020). Our network is trained with SGD on a 2-label subset of CIFAR-10 (Krizhevsky, 2009). The particular pair of labels is truck and deer, which was observed in Hayase et al. (2021) to be relatively difficult to backdoor since the two classes are easy to distinguish. We consider two backdoor triggers: the periodic image trigger of Barni et al. (2019) and a 3×3 checker patch applied at a random position in the image.