Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Indiscriminate Data Poisoning Attacks on Neural Networks

Authors: Yiwei Lu, Gautam Kamath, Yaoliang Yu

TMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present efficient implementations by parameterizing the attacker and allowing simultaneous and coordinated generation of tens of thousands of poisoned points, in contrast to most existing methods that generate poisoned points one by one. We further perform extensive experiments that empirically explore the effect of data poisoning attacks on deep neural networks. Our paper sets a new benchmark on the possibility of performing indiscriminate data poisoning attacks on modern neural networks."
Researcher Affiliation | Academia | "Yiwei Lu (EMAIL), University of Waterloo; Gautam Kamath (EMAIL), University of Waterloo; Yaoliang Yu (EMAIL), University of Waterloo"
Pseudocode | Yes |
Algorithm 1: TGDA Attack
Input: training set Dtr = {(xi, yi)}, i = 1, …, N; validation set Dv; training steps T; attacker step size α; attacker steps m; defender step size β; defender steps n; poisoning fraction ε; attacker L with θpre and ℓ = L(Dv, w); defender F with wpre and f = L(Dtr ∪ Dp, w).
1: Initialize poisoned data set D0p ← {(x̃1, ỹ1), …, (x̃εN, ỹεN)}
2: for t = 1, …, T do
3:     for i = 1, …, m do
4:         θ ← θ + α · Dθℓ(θ, wt)    // total gradient ascent (TGA) on L
5:     for j = 1, …, n do
6:         w ← w − β · ∇w f(θ, w)    // gradient descent (GD) on F
7: return model Lθ and poisoned set Dp = Lθ(D0p)
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing its code, nor a link to a repository for the described methodology. It only mentions following an existing implementation of a defense: "We follow the implementation in https://github.com/Yunodo/maxup".
Open Datasets | Yes | "Dataset: We consider image classification on MNIST (Deng, 2012) (60,000 training and 10,000 test images) and CIFAR-10 (Krizhevsky, 2009) (50,000 training and 10,000 test images) datasets."
Dataset Splits | Yes | "Training and validation set: During the attack, we need to split the clean training data into the training set Dtr and validation set Dv. Here we split the data into 70% training and 30% validation, respectively. Thus, for the MNIST dataset, we have |Dtr| = 42,000 and |Dv| = 18,000. For the CIFAR-10 dataset, we have |Dtr| = 35,000 and |Dv| = 15,000."
Hardware Specification | Yes | "Hardware and package: Experiments were run on a cluster with T4 and P100 GPUs."
Software Dependencies | No | The paper mentions PyTorch ("The platform we use is PyTorch (Paszke et al., 2019). Specifically, autodiff can be easily implemented using torch.autograd.") but does not specify a version number for it or for any other software dependency.
Experiment Setup | Yes | "Hyperparameters: (1) Pretrain: we use a batch size of 1,000 for MNIST and 256 for CIFAR-10, and optimize the network using our own implementation of gradient descent with torch.autograd. We choose the learning rate as 0.1 and train for 100 epochs. (2) Attack: for the attacker, we choose α = 0.01, m = 1 by default; for the defender, we choose β = 0.1, n = 20 by default. We set the batch size to 1,000 for MNIST and 256 for CIFAR-10, and train for 200 epochs, where the attacker is updated using total gradient ascent and the defender is updated using gradient descent. We follow Zhang et al. (2021) and implement TGA using conjugate gradient. We choose the poisoning fraction ε = 3% by default. (3) Testing: we choose the exact same setting as pretrain to keep the defender's training scheme consistent."
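The alternating loop of Algorithm 1 can be sketched on a 1-D toy problem. This is a minimal illustration, not the paper's implementation: the loss functions and names (`inner_descent`, `tgda_sketch`) are stand-ins, and the total gradient Dθℓ is approximated here by finite differences through the defender's response, whereas the paper implements TGA via conjugate gradient following Zhang et al. (2021).

```python
def inner_descent(theta, w, beta, n, f_grad):
    """Lines 5-6 of Algorithm 1: n gradient-descent steps on f for the defender."""
    for _ in range(n):
        w -= beta * f_grad(theta, w)
    return w

def tgda_sketch(T=100, alpha=0.05, m=1, beta=0.1, n=20, eps=1e-4):
    # Toy stand-ins for the paper's losses:
    #   f(theta, w) = (w - theta)^2 : the defender fits the poisoned data theta
    #   l(theta, w) = -(w - 3)^2    : the attacker wants the model w dragged to 3
    f_grad = lambda th, ww: 2.0 * (ww - th)
    ell = lambda th, ww: -(ww - 3.0) ** 2
    theta, w = 0.0, 0.0
    for _ in range(T):                                  # line 2: outer loop over T steps
        for _ in range(m):                              # lines 3-4: attacker TGA step
            # Total gradient of l w.r.t. theta, differentiating *through* the
            # defender's n-step response (central finite differences).
            lp = ell(theta + eps, inner_descent(theta + eps, w, beta, n, f_grad))
            lm = ell(theta - eps, inner_descent(theta - eps, w, beta, n, f_grad))
            theta += alpha * (lp - lm) / (2.0 * eps)    # ascent on l
        w = inner_descent(theta, w, beta, n, f_grad)    # lines 5-6: defender GD steps
    return theta, w
```

On this toy instance both players converge near 3: the attacker moves the "poison" θ so that the defender's best response lands on the attacker's target.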
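The 70%/30% split arithmetic quoted in the Dataset Splits row can be checked with a trivial helper (`split_sizes` is an illustrative name, not from the paper):

```python
def split_sizes(n_total, train_frac=0.7):
    """Return (|Dtr|, |Dv|) for a train_frac / (1 - train_frac) split."""
    n_train = int(n_total * train_frac)
    return n_train, n_total - n_train

# MNIST: 60,000 clean training images -> (42000, 18000)
# CIFAR-10: 50,000 clean training images -> (35000, 15000)
```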
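For reference, the reported hyperparameters can be collected into a single config mapping. The key names here are hypothetical; only the values come from the quoted experiment setup.

```python
# Hypothetical config dict; values are the defaults reported in the paper's
# experiment setup (pretrain / attack phases), key names are my own.
TGDA_CONFIG = {
    "pretrain": {
        "batch_size": {"mnist": 1000, "cifar10": 256},
        "lr": 0.1,
        "epochs": 100,
    },
    "attack": {
        "alpha": 0.01,          # attacker step size
        "m": 1,                 # attacker steps per outer iteration
        "beta": 0.1,            # defender step size
        "n": 20,                # defender steps per outer iteration
        "batch_size": {"mnist": 1000, "cifar10": 256},
        "epochs": 200,
        "poison_fraction": 0.03,  # epsilon = 3%
    },
    # Testing reuses the pretrain settings to keep the defender's
    # training scheme consistent.
}
```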