Label Poisoning is All You Need

Authors: Rishi Jha, Jonathan Hayase, Sewoong Oh

NeurIPS 2023

Reproducibility assessment. Each entry lists the variable, the automated result, and the supporting LLM response.

Research Type: Experimental
LLM Response: We introduce a novel approach to design label-only backdoor attacks, which we call FLIP, and demonstrate its strengths on three datasets (CIFAR-10, CIFAR-100, and Tiny-ImageNet) and four architectures (ResNet-32, ResNet-18, VGG-19, and Vision Transformer). With only 2% of CIFAR-10 labels corrupted, FLIP achieves a near-perfect attack success rate of 99.4% while suffering only a 1.8% drop in the clean test accuracy.

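To make the threat model concrete, here is a minimal sketch of a label-only corruption under the 2% budget quoted above, assuming torchvision's CIFAR-10 loader. The random choice of which labels to flip is a placeholder: selecting those labels is precisely what FLIP optimizes.

```python
# Minimal sketch of a 2% label-only corruption of CIFAR-10. The random flip
# set is a stand-in; FLIP's contribution is choosing *which* labels to flip.
import random
import torchvision

train = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)
budget = int(0.02 * len(train.targets))   # 2% of 50,000 training labels = 1,000
target_class = 0                          # hypothetical attacker-chosen target
for i in random.sample(range(len(train.targets)), budget):
    train.targets[i] = target_class       # only labels change; images are untouched
```
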
Researcher Affiliation: Academia
LLM Response: Rishi D. Jha*, Jonathan Hayase*, Sewoong Oh; Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle. {rjha01, jhayase, sewoong}@cs.washington.edu

Pseudocode: Yes
LLM Response: Algorithm 1: Step 2 of Flipping Labels to Inject Poison (FLIP): trajectory matching

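The paper's Algorithm 1 is not reproduced on this page, but a hedged sketch of what one trajectory-matching step over relaxed label variables could look like follows. The function name, the normalized squared-distance loss, and the single simulated SGD step are all assumptions, not the authors' implementation (which, per the entry below, was not released); the sketch only assumes consecutive expert checkpoints theta_t and theta_next saved during Step 1.

```python
# Hedged sketch of one trajectory-matching step over relaxed (soft) labels.
# All names and the exact matching loss are assumptions, not the paper's code.
import torch
import torch.nn.functional as F

def trajectory_matching_step(model, theta_t, theta_next, images, label_logits, inner_lr):
    """Differentiate a parameter-matching loss w.r.t. candidate label logits.

    theta_t / theta_next: consecutive expert checkpoints (lists of tensors).
    label_logits: trainable per-image class logits (requires_grad=True).
    """
    # Start the student at the expert checkpoint theta_t.
    with torch.no_grad():
        for p, w in zip(model.parameters(), theta_t):
            p.copy_(w)
    soft_labels = F.softmax(label_logits, dim=1)           # relaxed labels
    ce = F.cross_entropy(model(images), soft_labels)       # soft-target cross-entropy
    grads = torch.autograd.grad(ce, list(model.parameters()), create_graph=True)
    # Simulate one SGD step and compare it to the expert's actual next checkpoint.
    stepped = [w - inner_lr * g for w, g in zip(theta_t, grads)]
    num = sum(((s - n) ** 2).sum() for s, n in zip(stepped, theta_next))
    den = sum(((t - n) ** 2).sum() for t, n in zip(theta_t, theta_next))
    match = num / den                                       # normalized matching loss
    match.backward()                                        # gradient lands in label_logits.grad
    return match.item()
```

In an outer loop, one would update label_logits from the accumulated gradient and finally round the soft labels to hard flips within the corruption budget.
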
Open Source Code: No
LLM Response: The paper does not contain an explicit statement about releasing the source code for the methodology described, nor does it provide a link to a code repository.

Open Datasets: Yes
LLM Response: We evaluate FLIP on three standard datasets: CIFAR-10, CIFAR-100, and Tiny-ImageNet. ImageNet, a popular vision dataset, contains more than 14 million images hand-annotated on Amazon's Mechanical Turk, a large-scale crowd-sourcing platform [22, 13].

Dataset Splits: No
LLM Response: The paper mentions evaluating on "clean test data" and a "clean test set" (S_ct in Eq. 1) and uses the standard CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets. It states "For better test performance and to simulate real-world use cases, we follow the standard CIFAR data augmentation procedure..." but does not explicitly provide training/validation/test split percentages or sample counts for reproduction.

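For reference, the "standard CIFAR data augmentation procedure" cited above commonly refers to a pipeline like the one below, built from PyTorch transforms (the library the next entries mention). The normalization constants are the usual CIFAR-10 statistics and are an assumption; the paper is not quoted as listing them.

```python
# Commonly assumed "standard CIFAR" augmentation (PyTorch transforms).
import torchvision.transforms as T

cifar10_mean = (0.4914, 0.4822, 0.4465)   # usual CIFAR-10 channel stats (assumption)
cifar10_std = (0.2470, 0.2435, 0.2616)
train_transform = T.Compose([
    T.RandomCrop(32, padding=4),          # pad-and-crop translation jitter
    T.RandomHorizontalFlip(),             # left-right flip with probability 0.5
    T.ToTensor(),
    T.Normalize(cifar10_mean, cifar10_std),
])
test_transform = T.Compose([T.ToTensor(), T.Normalize(cifar10_mean, cifar10_std)])
```
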
Hardware Specification: Yes
LLM Response: Each experiment was run on a single, randomly-selected GPU on a cluster containing NVIDIA A40 and 2080ti GPUs.

Software Dependencies: No
LLM Response: The paper mentions using "PyTorch transforms" but does not specify a version number for PyTorch or any other software dependencies.

Experiment Setup: Yes
LLM Response: In the ResNet experiments, the expert and user models were trained using SGD with a batch size of 256, a starting learning rate of γ = 0.1 (scheduled to reduce by a factor of 10 at epochs 75 and 150), weight decay of λ = 0.0002, and Nesterov momentum of µ = 0.9. For the larger VGG and ViT models, the learning rate and weight decay were adjusted to γ = 0.01 and 0.05, and λ = 0.0002 and 0.0005, respectively.
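The reported ResNet configuration maps directly onto PyTorch's optimizer API. The sketch below assumes a stand-in torchvision ResNet-18 and a 200-epoch budget; the schedule implies training past epoch 150, but the exact epoch count is not quoted.

```python
# Sketch of the reported ResNet training configuration (PyTorch).
import torch
import torchvision

model = torchvision.models.resnet18(num_classes=10)  # stand-in for the paper's ResNet-32/18
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,               # starting learning rate, gamma = 0.1
    momentum=0.9,         # Nesterov momentum, mu = 0.9
    nesterov=True,
    weight_decay=2e-4,    # weight decay, lambda = 0.0002
)
# Reduce the learning rate by a factor of 10 at epochs 75 and 150, as reported.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[75, 150], gamma=0.1)

for epoch in range(200):  # epoch count is an assumption; batch size 256 goes in the DataLoader
    ...                   # one training epoch over the (possibly label-poisoned) training set
    scheduler.step()
```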