Neurotoxin: Durable Backdoors in Federated Learning

Authors: Zhengming Zhang, Ashwinee Panda, Linyue Song, Yaoqing Yang, Michael Mahoney, Prateek Mittal, Kannan Ramchandran, Joseph Gonzalez

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct an exhaustive evaluation across ten natural language processing and computer vision tasks, and we find that we can double the durability of state of the art backdoors."
Researcher Affiliation | Academia | "1 School of Information Science and Engineering, Southeast University, China; 2 Department of Electrical and Computer Engineering, Princeton University; 3 Department of Electrical Engineering and Computer Sciences, University of California at Berkeley; 4 International Computer Science Institute and Department of Statistics, University of California at Berkeley."
Pseudocode | Yes | "Algorithm 1. (Left) Baseline attack. (Right) Neurotoxin. The difference is the red line." (A hedged sketch of that single differing step appears after this table.)
Open Source Code | Yes | "The code to reproduce our attack results is open-sourced."
Open Datasets | Yes | "Tasks. In Table 2 we summarize 10 tasks. Each task consists of a dataset... 1 https://bigquery.cloud.google.com/dataset/fh-bigquery:reddit_comments ... Task 3 uses the Sentiment140 Twitter dataset (Go et al., 2009) for sentiment analysis ... Task 4 uses the IMDB movie review dataset (Maas et al., 2011) for sentiment analysis ... Computer Vision. CIFAR10, CIFAR100 (Krizhevsky et al., 2009), and EMNIST (Cohen et al., 2017) are benchmark datasets for the multiclass classification task in computer vision." (A loading sketch for the vision benchmarks appears after this table.)
Dataset Splits | No | The paper does not explicitly provide the training, validation, and test splits (e.g., percentages or sample counts) for the general federated learning models; it only details the poisoned dataset used for the attack.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU/CPU models, memory, or cloud instance types.
Software Dependencies | No | "We implement all methods in PyTorch (Paszke et al., 2019)." (PyTorch is mentioned, but no version number is provided for reproducibility.)
Experiment Setup | Yes | "We implement all methods in PyTorch (Paszke et al., 2019). ... Tasks. In Table 2 we summarize 10 tasks. Each task consists of a dataset... For all tasks, 10 devices are selected to participate in each round of FL, and we also provide results with 100 devices. ... The attacker participates in only AttackNum rounds... The smallest value of AttackNum we evaluate is 40... We implement the popular norm clipping defense (Sun et al., 2019) in all experiments. We find the smallest value of the norm clipping parameter p that does not impact convergence... We choose p which has small effect on benign test accuracy, p = 3.0 for IMDB, and p = 1.0 for CIFAR10." (A hedged sketch of this norm-clipping defense appears after this table.)
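
The single step that separates Neurotoxin from the baseline attack in Algorithm 1 is a coordinate-wise projection: the attacker avoids the parameters that benign clients update most heavily and writes its backdoor into rarely-updated coordinates instead. The sketch below illustrates that masking step in PyTorch; the function names, the mask_ratio parameter, and the use of an observed benign gradient as the masking signal are our reading of the paper, not the released code.

```python
import torch


def neurotoxin_mask(benign_grad: torch.Tensor, mask_ratio: float = 0.99) -> torch.Tensor:
    """Build a 0/1 mask keeping only the coordinates that benign clients
    update the least (smallest absolute benign-gradient magnitude).

    mask_ratio is the fraction of coordinates the attacker may use; the
    top (1 - mask_ratio) fraction of coordinates is zeroed out.
    """
    flat = benign_grad.abs().flatten()
    k = int((1.0 - mask_ratio) * flat.numel())   # number of heavily-updated coordinates to avoid
    mask = torch.ones_like(flat)
    if k > 0:
        hot_idx = torch.topk(flat, k).indices    # coordinates benign clients rely on most
        mask[hot_idx] = 0.0
    return mask.view_as(benign_grad)


def project_malicious_update(malicious_update: torch.Tensor,
                             benign_grad: torch.Tensor,
                             mask_ratio: float = 0.99) -> torch.Tensor:
    """The 'red line' of Algorithm 1: keep the attacker's update out of the
    coordinates that benign clients update most frequently."""
    return malicious_update * neurotoxin_mask(benign_grad, mask_ratio)
```

In the full attack this mask would be applied repeatedly during the attacker's local training on poisoned data, so the malicious drift stays inside the masked subspace; consult the open-sourced code for the exact procedure.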
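
The vision benchmarks listed under Open Datasets are all distributed with torchvision, so obtaining them reduces to a few download calls. A minimal loading sketch follows; the root directory, the EMNIST split, and the bare ToTensor transform are placeholders rather than the paper's configuration, and the non-IID partitioning across FL clients is not shown.

```python
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# Benchmark vision datasets named in the paper; paths and split are illustrative.
cifar10 = datasets.CIFAR10(root="./data", train=True, download=True, transform=to_tensor)
cifar100 = datasets.CIFAR100(root="./data", train=True, download=True, transform=to_tensor)
emnist = datasets.EMNIST(root="./data", split="byclass",  # split is a placeholder
                         train=True, download=True, transform=to_tensor)
```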
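
The norm clipping defense (Sun et al., 2019) referenced in the Experiment Setup row bounds the L2 norm of every client update before the server averages them, with p = 3.0 for IMDB and p = 1.0 for CIFAR10 quoted above. A minimal server-side sketch, assuming each update is already flattened into a single vector; the helper names are hypothetical.

```python
import torch


def clip_update(update: torch.Tensor, p: float) -> torch.Tensor:
    """Rescale a client's flattened model update so that ||update||_2 <= p."""
    scale = torch.clamp(p / (update.norm(2) + 1e-12), max=1.0)
    return update * scale


def aggregate_with_clipping(client_updates: list, p: float) -> torch.Tensor:
    """FedAvg-style mean of norm-clipped updates (e.g. p = 3.0 for IMDB, p = 1.0 for CIFAR10)."""
    clipped = [clip_update(u, p) for u in client_updates]
    return torch.stack(clipped).mean(dim=0)
```

Choosing the smallest p that leaves benign test accuracy unaffected, as the paper describes, limits how far any single (possibly malicious) update can move the global model in one round.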