Attack of the Tails: Yes, You Really Can Backdoor Federated Learning
Authors: Hongyi Wang, Kartik Sreenivasan, Shashank Rajput, Harit Vishwakarma, Saurabh Agarwal, Jy-yong Sohn, Kangwook Lee, Dimitris Papailiopoulos
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We couple our theoretical results with a new family of backdoor attacks, which we refer to as edge-case backdoors. We further exhibit that, with careful tuning at the side of the adversary, one can insert them across a range of machine learning tasks (e.g., image classification, OCR, text prediction, sentiment analysis), and bypass state-of-the-art defense mechanisms. ... Section 4 (Experiments): The goal of our empirical study is to highlight the effectiveness of edge-case attacks against the state of the art (SOTA) of FL defenses. We conduct our experiments on real-world datasets and a simulated FL environment. Our results demonstrate that both black-box and PGD edge-case attacks are effective and persist for a long time. |
| Researcher Affiliation | Academia | Hongyi Wang (w), Kartik Sreenivasan (w), Shashank Rajput (w), Harit Vishwakarma (w), Saurabh Agarwal (w), Jy-yong Sohn (k), Kangwook Lee (w), Dimitris Papailiopoulos (w); (w) University of Wisconsin-Madison, (k) Korea Advanced Institute of Science and Technology |
| Pseudocode | No | The paper describes its attack strategies in detail (e.g., the black-box attack, the PGD attack, and the PGD attack with model replacement) but does not provide any explicitly labeled pseudocode or algorithm blocks. (A hedged sketch of the PGD projection step appears below the table.) |
| Open Source Code | Yes | Our implementation is publicly available to reproduce all experimental results: https://github.com/kamikazekartik/OOD_Federated_Learning. Our edge-case backdoor attack is also maintained in the FedML (https://fedml.ai/) framework [76]. |
| Open Datasets | Yes | We consider the following five tasks with various values of K (num. of clients) and m (num. of clients in each iteration): (Task 1) Image classification on CIFAR-10 [77] with VGG-9 [78] (K = 200, m = 10), (Task 2) Digit classification on EMNIST [79] with LeNet [80] (K = 3383, m = 30), (Task 3) Image classification on ImageNet (ILSVRC2012) [81] with VGG-11 (K = 1000, m = 10), (Task 4) Sentiment classification on Sentiment140 [82] with LSTM [83] (K = 1948, m = 10), and (Task 5) Next-word prediction on the Reddit dataset [13, 66] with LSTM (K = 80,000, m = 100). ... Constructing Dedge: We manually construct Dedge for each task as follows: (Task 1) We collect images of Southwest Airlines planes and label them as "truck"; (Task 2) We take images of 7s from ARDIS [84] (a dataset extracted from 15,000 Swedish church records written by different priests with various handwriting styles in the nineteenth and twentieth centuries) and label them as "1"; (Task 3) We collect images of people in certain ethnic dresses and assign a completely irrelevant label; (Task 4) We scrape tweets containing the name of the Greek film director Yorgos Lanthimos, along with positive sentiment comments, and label them "negative"; and (Task 5) We construct various prompts containing the city Athens and choose a target word so as to make the sentence bear negative connotation. |
| Dataset Splits | No | The paper mentions using the "MNIST test set" and creates edge-case datasets (Dedge) by combining data points from existing datasets (D), but it does not provide explicit train/validation/test splits, percentages, or a split methodology for the main datasets; it only refers to the test sets of specific datasets and varies the ratio of Dedge in the attacker's dataset. (A minimal sketch of this Dedge mixing appears below the table.) |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions that its "edge-case backdoor attack is also maintained in the FedML (https://fedml.ai/) framework [76]", but it does not provide specific version numbers for FedML or for any other software, libraries, or dependencies used in the experiments. |
| Experiment Setup | Yes | We consider the following five tasks with various values of K (num. of clients) and m (num. of clients in each iteration): (Task 1) Image classification on CIFAR-10 [77] with VGG-9 [78] (K = 200, m = 10), (Task 2) Digit classification on EMNIST [79] with LeNet [80] (K = 3383, m = 30), (Task 3) Image classification on ImageNet (ILSVRC2012) [81] with VGG-11 (K = 1000, m = 10), (Task 4) Sentiment classification on Sentiment140 [82] with LSTM [83] (K = 1948, m = 10), and (Task 5) Next-word prediction on the Reddit dataset [13, 66] with LSTM (K = 80,000, m = 100). All the other hyperparameters are provided in the appendix. ... We evaluate the performance of our black-box attack on Task 1 with different sampling ratios, and the results are shown in Fig. 3. ... We vary the percentage of samples from Dedge split across the adversary and honest clients as p% and (100 − p)%, respectively, for p = 100%, 50%, and 10% (the detailed experimental setup can be found in the Appendix). ... We study the effectiveness of the edge-case attack under various attacking frequencies, in both the fixed-frequency attack setting (with the frequency varying in the range 0.01 to 1) and the fixed-pool attack setting (with the percentage of attackers among all clients varying from 0.5% to 5%). (Minimal sketches of the simulated FedAvg round and the two attack schedules appear below the table.) |
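
The PGD attack is described only in prose in the paper: the adversary trains on its poisoned data but repeatedly projects its parameters back onto an ℓ2 ball around the global model it received, so that norm-based defenses see an unremarkable update. Below is a minimal PyTorch sketch of that loop; the function names (`project_to_ball`, `pgd_local_update`) and the defaults (`eps`, `lr`, `epochs`) are illustrative assumptions, not the authors' API.

```python
import torch

def project_to_ball(model, global_params, eps):
    """Project the model's flattened parameters onto an L2 ball of radius
    eps centered at the global model (the PGD projection step; names and
    defaults here are assumptions, not the paper's code)."""
    with torch.no_grad():
        w = torch.cat([p.view(-1) for p in model.parameters()])
        w0 = torch.cat([p.view(-1) for p in global_params])
        delta = w - w0
        norm = delta.norm()
        if norm > eps:
            w_proj = w0 + delta * (eps / norm)
            offset = 0
            for p in model.parameters():  # write projected values back
                n = p.numel()
                p.copy_(w_proj[offset:offset + n].view_as(p))
                offset += n

def pgd_local_update(model, global_params, poisoned_loader,
                     eps=1.0, lr=0.01, epochs=2, device="cpu"):
    """Adversarial local training: plain SGD on the poisoned (edge-case)
    data, projecting after every step so the submitted update stays
    within distance eps of the global model."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.to(device).train()
    for _ in range(epochs):
        for x, y in poisoned_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            project_to_ball(model, global_params, eps)
    return model
```

Here `global_params` would be a detached snapshot of the global weights taken at the start of the round, e.g. `[p.detach().clone() for p in global_model.parameters()]`.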
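
The simulated FL environment samples m of the K clients each round and averages their locally trained weights (FedAvg). A minimal sketch under assumed names, with the per-task values of K and m taken from the rows above:

```python
import copy
import random
import torch

def fed_avg_round(global_model, client_loaders, m, local_train):
    """One FedAvg round as in the paper's simulation: sample m clients,
    train each locally from a copy of the global model, then average.
    `client_loaders` and `local_train` are placeholders, not the
    authors' interfaces."""
    sampled = random.sample(client_loaders, m)
    local_states = []
    for loader in sampled:
        local_model = copy.deepcopy(global_model)
        local_train(local_model, loader)
        local_states.append(local_model.state_dict())
    # uniform parameter average (buffers are averaged naively here too)
    avg_state = copy.deepcopy(local_states[0])
    for key in avg_state:
        avg_state[key] = torch.stack(
            [s[key].float() for s in local_states]).mean(dim=0)
    global_model.load_state_dict(avg_state)
    return global_model

# e.g. Task 1 in the table: K = 200 clients, m = 10 sampled per round
# for t in range(num_rounds):
#     fed_avg_round(model, client_loaders, m=10, local_train=local_train)
```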
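
The p% / (100 − p)% experiment quoted above splits Dedge between the adversary and honest clients, and the attacker then trains on a mix of clean and mislabeled edge-case samples. A minimal sketch of that mixing, with both function names as assumptions:

```python
import random
from torch.utils.data import ConcatDataset, Subset

def split_edge_case(edge_dataset, p):
    """Give a fraction p of Dedge to the adversary and the rest to honest
    clients, mirroring the paper's p% / (100 - p)% experiment (the random
    shuffle is an illustrative assumption)."""
    idx = list(range(len(edge_dataset)))
    random.shuffle(idx)
    cut = int(p * len(idx))
    return Subset(edge_dataset, idx[:cut]), Subset(edge_dataset, idx[cut:])

def attacker_dataset(clean_dataset, adversary_edge):
    """The attacker's local data: clean samples mixed with its share of
    the mislabeled edge-case examples (e.g. Southwest Airlines planes
    labeled "truck" in Task 1)."""
    return ConcatDataset([clean_dataset, adversary_edge])
```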
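
Finally, the two attack schedules (fixed-frequency and fixed-pool) are also described only in prose. The sketch below encodes one plausible reading of each: the fixed-frequency adversary participates once every 1/freq rounds, while the fixed-pool setting designates a fixed fraction of the population as adversarial and lets ordinary uniform sampling decide when attackers appear.

```python
import random

def is_adversarial_round(round_idx, freq):
    """Fixed-frequency attack: the adversary joins once every 1/freq
    rounds (freq varies in [0.01, 1] in the experiments quoted above)."""
    period = max(1, round(1.0 / freq))
    return round_idx % period == 0

def sample_clients_fixed_pool(num_clients, m, attacker_frac):
    """Fixed-pool attack: a fixed fraction (0.5%-5% in the paper) of all
    clients is adversarial; each round samples m clients uniformly, so
    attackers participate whenever they happen to be drawn."""
    attackers = set(range(int(attacker_frac * num_clients)))
    sampled = random.sample(range(num_clients), m)
    return sampled, [c for c in sampled if c in attackers]
```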