Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks

Authors: Yiwei Lu, Gautam Kamath, Yaoliang Yu

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform extensive experiments to confirm our theoretical findings, test the predictability of our transition threshold, and significantly improve existing indiscriminate data poisoning baselines over a range of datasets and models.
Researcher Affiliation | Collaboration | 1. School of Computer Science, University of Waterloo, Canada; 2. Vector Institute.
Pseudocode | Yes | Algorithm 1: Gradient Canceling (GC) Attack (a hedged sketch follows the table)
Open Source Code | Yes | Our code is available at https://github.com/watml/plim.
Open Datasets | Yes | Dataset: We consider image classification on MNIST (Deng, 2012) (60k training and 10k test images), CIFAR-10 (Krizhevsky, 2009) (50k training and 10k test images), and TinyImageNet (Chrabaszcz et al., 2017) (100k training, 10k validation and 10k test images).
Dataset Splits | Yes | For the first two datasets, we further split the training data into a 70% training set and a 30% validation set, respectively.
Hardware Specification | Yes | Hardware and package: experiments were run on a cluster with T4 and P100 GPUs.
Software Dependencies | No | The platform we use is PyTorch (Paszke et al., 2019).
Experiment Setup | Yes | Optimizer, learning rate scheduler and hyperparameters: we use SGD with momentum for optimization and the cosine learning rate scheduler (Loshchilov and Hutter, 2017) for the Gradient Canceling algorithm. We set the initial learning rate as 0.5 and run 1000 epochs across every experiment (see the usage sketch after the table).
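
The Algorithm 1 row refers to the Gradient Canceling (GC) attack, in which poisoned inputs are optimized so that their gradients cancel the clean-data gradient at the target parameters, driving the total training gradient toward zero. The sketch below is a minimal PyTorch illustration of that idea, reusing the optimizer settings quoted in the Experiment Setup row (SGD with momentum, cosine schedule, initial learning rate 0.5, 1000 steps). The function name, the fixed random poison labels, the uniform initialization, the momentum value, and the unweighted sum of clean and poison gradients are assumptions for illustration only; the authors' actual implementation is in the linked repository.

```python
# Hedged sketch of the Gradient Canceling (GC) idea: given target parameters
# theta_t and the clean-data gradient at theta_t, optimize a small set of
# poisoned inputs so that the combined gradient is driven toward zero.
import torch
import torch.nn.functional as F

def gradient_canceling(target_model, clean_grad, num_classes,
                       n_poison=500, input_shape=(1, 28, 28),
                       steps=1000, lr=0.5, device="cpu"):
    """Return poisoned inputs/labels whose gradients approximately cancel clean_grad."""
    # clean_grad: list of tensors, the gradient of the clean training loss at theta_t.
    x_p = torch.rand(n_poison, *input_shape, device=device, requires_grad=True)
    y_p = torch.randint(0, num_classes, (n_poison,), device=device)  # labels kept fixed (assumption)
    opt = torch.optim.SGD([x_p], lr=lr, momentum=0.9)                # momentum value assumed
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=steps)

    params = [p for p in target_model.parameters() if p.requires_grad]
    for _ in range(steps):
        opt.zero_grad()
        loss_p = F.cross_entropy(target_model(x_p), y_p, reduction="sum")
        grad_p = torch.autograd.grad(loss_p, params, create_graph=True)
        # Objective: squared norm of (clean gradient + poison gradient) at theta_t.
        obj = sum(((g_c + g_q) ** 2).sum() for g_c, g_q in zip(clean_grad, grad_p))
        obj.backward()
        opt.step()
        sched.step()
        with torch.no_grad():
            x_p.clamp_(0.0, 1.0)  # keep poisoned images in the valid pixel range
    return x_p.detach(), y_p
```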
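The dataset, split, and hyperparameter rows can be tied together in a short usage sketch: a 70/30 split of the MNIST training data via torchvision, the clean-data gradient evaluated at the target parameters, and a call to the GC sketch above with the quoted settings (initial learning rate 0.5, 1000 epochs). The batch size, poisoning budget, and placeholder linear model standing in for the target model theta_t (which the paper obtains separately) are assumptions, not values taken from the paper.

```python
# Usage sketch: MNIST with a 70/30 train/validation split, clean-data gradient
# at the target parameters, then a GC run with the quoted hyperparameters.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

full_train = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
n_train = int(0.7 * len(full_train))                       # 70% training split
train_set, val_set = random_split(full_train, [n_train, len(full_train) - n_train])
train_loader = DataLoader(train_set, batch_size=256)       # batch size assumed

target_model = torch.nn.Sequential(torch.nn.Flatten(),
                                   torch.nn.Linear(28 * 28, 10))  # placeholder for theta_t
params = [p for p in target_model.parameters() if p.requires_grad]

# Gradient of the clean training loss at the target parameters.
clean_grad = [torch.zeros_like(p) for p in params]
for x, y in train_loader:
    loss = F.cross_entropy(target_model(x), y, reduction="sum")
    for g, g_batch in zip(clean_grad, torch.autograd.grad(loss, params)):
        g += g_batch

# Poisoning budget (n_poison) assumed for illustration.
x_p, y_p = gradient_canceling(target_model, clean_grad, num_classes=10,
                              n_poison=3000, steps=1000, lr=0.5)
```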