Neurotoxin: Durable Backdoors in Federated Learning
Authors: Zhengming Zhang, Ashwinee Panda, Linyue Song, Yaoqing Yang, Michael Mahoney, Prateek Mittal, Kannan Ramchandran, Joseph Gonzalez
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an exhaustive evaluation across ten natural language processing and computer vision tasks, and we find that we can double the durability of state of the art backdoors. |
| Researcher Affiliation | Academia | (1) School of Information Science and Engineering, Southeast University, China; (2) Department of Electrical and Computer Engineering, Princeton University; (3) Department of Electrical Engineering and Computer Sciences, University of California at Berkeley; (4) International Computer Science Institute and Department of Statistics, University of California at Berkeley. |
| Pseudocode | Yes | Algorithm 1 (Left.) Baseline attack. (Right.) Neurotoxin. The difference is the red line. (A sketch of the projection step that red line performs appears after this table.) |
| Open Source Code | Yes | The code to reproduce our attack results is open-sourced. |
| Open Datasets | Yes | Tasks. In Table 2 we summarize 10 tasks. Each task consists of a dataset... https://bigquery.cloud.google.com/dataset/fh-bigquery:reddit_comments ... Task 3 uses the Sentiment140 Twitter dataset (Go et al., 2009) for sentiment analysis ... Task 4 uses the IMDB movie review dataset (Maas et al., 2011) for sentiment analysis ... Computer Vision. CIFAR10, CIFAR100 (Krizhevsky et al., 2009), and EMNIST (Cohen et al., 2017) are benchmark datasets for the multiclass classification task in computer vision. |
| Dataset Splits | No | The paper does not explicitly provide the specific training, validation, and test dataset splits (e.g., percentages or sample counts) for the general federated learning models, only details about the 'poisoned dataset' used for the attack. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU/CPU models, memory, or cloud computing instance types. |
| Software Dependencies | No | We implement all methods in PyTorch (Paszke et al., 2019). (PyTorch is mentioned, but no specific version number is provided for reproducibility.) |
| Experiment Setup | Yes | We implement all methods in PyTorch (Paszke et al., 2019). ... Tasks. In Table 2 we summarize 10 tasks. Each task consists of a dataset... For all tasks, 10 devices are selected to participate in each round of FL, and we also provide results with 100 devices. ... The attacker participates in only AttackNum rounds... The smallest value of AttackNum we evaluate is 40... We implement the popular norm clipping defense (Sun et al., 2019) in all experiments. We find the smallest value of the norm clipping parameter p that does not impact convergence... We choose p which has a small effect on benign test accuracy: p = 3.0 for IMDB and p = 1.0 for CIFAR10. (A sketch of this clipping rule also follows the table.) |
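As the Pseudocode row notes, Neurotoxin differs from the baseline attack by a single projection step: the attacker observes an approximation of the benign gradient (the difference between consecutive global models it downloads), zeroes out the coordinates that benign clients update most heavily, and confines its poisoned update to the remaining, rarely-updated coordinates. Below is a minimal PyTorch sketch of that masking step under this reading of Algorithm 1; the function names and the `top_ratio` parameter are illustrative, not taken from the paper's released code.

```python
import torch

def neurotoxin_mask(benign_update: torch.Tensor, top_ratio: float) -> torch.Tensor:
    """Build a 0/1 mask that zeroes the top `top_ratio` fraction of
    coordinates (by magnitude) of the observed benign update, so the
    attacker only writes to coordinates benign clients rarely update."""
    k = int(top_ratio * benign_update.numel())
    flat = benign_update.abs().flatten()
    top_idx = torch.topk(flat, k).indices   # heavily-updated coordinates
    mask = torch.ones_like(flat)
    mask[top_idx] = 0.0                      # forbid the attacker there
    return mask.view_as(benign_update)

def project_update(poison_update: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Projected-gradient step inside the attacker's local training loop:
    after each optimizer step on the poisoned data, project the accumulated
    update back onto the allowed (bottom-k) coordinates."""
    return poison_update * mask
```

Because most participants are benign, the previous round's aggregate update is a reasonable proxy for where benign mass concentrates, which is what makes the masked coordinates durable hiding places for the backdoor.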
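The Experiment Setup row references the norm clipping defense of Sun et al. (2019), under which the server bounds the L2 norm of each client update before averaging. The sketch below is a minimal illustration under that reading, using the paper's reported bounds (p = 3.0 for IMDB, p = 1.0 for CIFAR10); the function names and the FedAvg-style aggregation loop are assumptions for illustration, not the authors' implementation.

```python
import torch

def clip_update(update: torch.Tensor, p: float) -> torch.Tensor:
    """Scale a client update so its L2 norm is at most p (Sun et al., 2019)."""
    norm = update.norm(p=2)
    return update if norm <= p else update * (p / norm)

def aggregate(client_updates: list[torch.Tensor], p: float) -> torch.Tensor:
    """FedAvg-style server step: clip every client update to norm p,
    then average the clipped updates into the next global model delta."""
    clipped = [clip_update(u, p) for u in client_updates]
    return torch.stack(clipped).mean(dim=0)

# Bounds reported in the paper: the smallest p that does not hurt convergence.
P_IMDB, P_CIFAR10 = 3.0, 1.0
```

Choosing the smallest p that leaves benign accuracy intact is the standard tuning rule here: a tighter bound suppresses large malicious updates more aggressively, but an over-tight bound slows or stalls convergence.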