Defending against Backdoors in Federated Learning with Robust Learning Rate

Authors: Mustafa Safa Ozdayi, Murat Kantarcioglu, Yulia R. Gel

AAAI 2021, pp. 9268-9276 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we first illustrate the performance of our defense, and then provide some empirical justification for its effectiveness via experimental evaluation. Our implementation is done using PyTorch (Paszke et al. 2019), and the code is available at https://github.com/TinfoilHat0/Defending-Against-Backdoors-with-Robust-Learning-Rate.
Researcher Affiliation | Academia | Mustafa Safa Ozdayi, Murat Kantarcioglu, Yulia R. Gel, The University of Texas at Dallas, {mustafa.ozdayi, muratk, ygl}@utdallas.edu
Pseudocode | No | The paper describes the FL protocol and its proposed update rules using mathematical equations, but it does not contain structured pseudocode or algorithm blocks (e.g., labeled 'Algorithm' or 'Pseudocode'). (An illustrative sketch of the robust learning rate rule is given after the table.)
Open Source Code | Yes | Our implementation is done using PyTorch (Paszke et al. 2019), and the code is available at https://github.com/TinfoilHat0/Defending-Against-Backdoors-with-Robust-Learning-Rate.
Open Datasets | Yes | We use the Fashion MNIST (Xiao, Rasul, and Vollgraf 2017) dataset, and give each agent an equal number of samples from the training data via uniform sampling. For this setting, we use the Federated EMNIST dataset from the LEAF benchmark (Caldas et al. 2018). We test this attack only against FedAvg with RLR, as other defenses already fail on default backdoor attacks, on the CIFAR10 dataset (Krizhevsky, Nair, and Hinton 2009). (A data-partitioning sketch is given after the table.)
Dataset Splits | Yes | Validation and base class accuracies are computed on the validation data that comes with the used datasets, and the backdoor accuracy is computed on a poisoned validation set that is constructed by (i) extracting all base class instances from the original validation data, and (ii) adding the trojan pattern to them and re-labeling them as the target class. (A construction sketch is given after the table.)
Hardware Specification | No | The paper mentions that its implementation uses PyTorch but does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper states 'Our implementation is done using PyTorch (Paszke et al. 2019)', citing the PyTorch paper. However, it does not specify a version number for PyTorch or any other software dependencies, which would be necessary for exact reproduction.
Experiment Setup | Yes | At each round, the server uniformly samples C · K agents for training, where C ≤ 1. These agents locally train for E epochs with a batch size of B before sending their updates. Upon receiving and aggregating updates, we measure three key performance metrics of the model on validation data: validation accuracy, base class accuracy, and backdoor accuracy. Hyperparameters used in all experiments can be found in the Appendix. When there is an L2 clipping threshold M on updates, we assume M is public and every agent runs projected gradient descent to minimize their losses under this restriction, i.e., an agent ensures his update's L2 norm is bounded by M by monitoring the L2 norm of his model during training and clipping its weights appropriately. Table 1 shows specific values for M and σ used in the experiments, e.g., 'FedAvg, M = 4, σ = 1e-3'. (A norm-clipping sketch is given after the table.)
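
To make the update rule referenced in the Pseudocode row concrete, here is a minimal PyTorch sketch of a sign-agreement ("robust learning rate") aggregation step. It is an illustration rather than the authors' implementation: the function name `robust_lr_aggregate`, the flattened-tensor representation, and the default values of `theta` and `server_lr` are assumptions.

```python
import torch

def robust_lr_aggregate(global_weights, agent_updates, theta=4, server_lr=1.0):
    """Sketch of a robust-learning-rate aggregation step.

    global_weights: flat tensor of the current global model parameters.
    agent_updates: list of flat tensors, each agent's update (local - global).
    theta: minimum number of agents that must agree on a coordinate's sign
           for that coordinate to keep a positive learning rate.
    """
    stacked = torch.stack(agent_updates)                 # shape: (num_agents, dim)
    avg_update = stacked.mean(dim=0)                     # simple average of agent updates
    sign_agreement = torch.abs(torch.sign(stacked).sum(dim=0))
    # Coordinates with enough sign agreement keep +server_lr; the rest are flipped to -server_lr.
    lr_vector = torch.where(sign_agreement >= theta,
                            torch.full_like(avg_update, server_lr),
                            torch.full_like(avg_update, -server_lr))
    return global_weights + lr_vector * avg_update
```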
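
The Open Datasets row states that Fashion MNIST is split so that each agent receives an equal number of samples via uniform sampling. A minimal sketch of such an IID partition with torchvision follows; `num_agents` is a placeholder value, not the paper's K.

```python
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

num_agents = 10  # placeholder; the actual number of agents is specified in the paper's setup

# Download Fashion MNIST and shuffle its indices once.
train_set = datasets.FashionMNIST(root="./data", train=True, download=True,
                                  transform=transforms.ToTensor())
indices = torch.randperm(len(train_set))

# Give each agent an equal-sized, uniformly sampled shard of the training data.
shards = torch.chunk(indices, num_agents)
agent_datasets = [Subset(train_set, shard.tolist()) for shard in shards]
```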
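
The Dataset Splits row describes building the poisoned validation data in two steps: keep only the base-class instances, then stamp the trojan pattern and re-label them as the target class. A minimal sketch assuming image tensors of shape (N, C, H, W); the specific trigger used here (a small white patch) is an assumption, not necessarily the paper's pattern.

```python
import torch

def make_poisoned_valset(val_images, val_labels, base_class, target_class):
    """Build a poisoned validation set for measuring backdoor accuracy.

    val_images: float tensor of shape (N, C, H, W); val_labels: long tensor of shape (N,).
    """
    # (i) extract all base-class instances from the original validation data
    mask = val_labels == base_class
    poisoned = val_images[mask].clone()
    # (ii) add the trojan pattern (an assumed 4x4 white patch in one corner)
    #      and re-label every poisoned sample as the target class
    poisoned[:, :, :4, :4] = 1.0
    labels = torch.full((poisoned.shape[0],), target_class, dtype=torch.long)
    return poisoned, labels
```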
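
The Experiment Setup row says each agent keeps its update's L2 norm under the public threshold M by monitoring and clipping during local training. Below is a minimal sketch of that projection step, intended to run after each local optimizer step; the helper name and the flat-vector handling are assumptions, not the authors' code.

```python
import torch
from torch.nn.utils import parameters_to_vector, vector_to_parameters

def project_update_to_l2_ball(local_model, global_weights, M):
    """Clip the agent's update (local weights - round-start weights) to L2 norm <= M.

    global_weights: flat tensor of the global model parameters at the start of the round.
    Called after every local optimizer step (projected gradient descent under the norm bound).
    """
    with torch.no_grad():
        local_vec = parameters_to_vector(local_model.parameters())
        update = local_vec - global_weights
        norm = update.norm(p=2)
        if norm > M:
            # Scale the update back onto the ball of radius M, then write it into the model.
            vector_to_parameters(global_weights + update * (M / norm),
                                 local_model.parameters())
```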