Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Towards a Defense Against Federated Backdoor Attacks Under Continuous Training

Authors: Shuaiqi Wang, Jonathan Hayase, Giulia Fanti, Sewoong Oh

TMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate shadow learning (with R-SPECTRE filtering) on 4 datasets: EMNIST (Cohen et al., 2017), CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Tiny ImageNet (Le & Yang, 2015). We compare with 8 state-of-the-art defense algorithms under the federated setting: SPECTRE (gradient- and representation-based) (Hayase et al., 2021), Robust Federated Aggregation (RFA) (Pillutla et al., 2019), Norm Clipping, Noise Adding (Sun et al., 2019), CRFL (Xie et al., 2021), FLAME (Nguyen et al., 2022), Multi-Krum (Blanchard et al., 2017b), and FoolsGold (Fung et al., 2018).
Researcher Affiliation | Academia | Shuaiqi Wang (EMAIL), Department of Electrical and Computer Engineering, Carnegie Mellon University; Jonathan Hayase (EMAIL), Paul G. Allen School of Computer Science & Engineering, University of Washington; Giulia Fanti (EMAIL), Department of Electrical and Computer Engineering, Carnegie Mellon University; Sewoong Oh (EMAIL), Paul G. Allen School of Computer Science & Engineering, University of Washington
Pseudocode | Yes | Algorithm 1: Shadow learning framework; Algorithm 2: Get Threshold (SPECTRE-based instantiation); Algorithm 3: Filter (SPECTRE-based instantiation); Algorithm 4: QUEscore (Hayase et al., 2021); Algorithm 5: Shadow learning framework (training) without knowing the exact target label
Open Source Code | No | The paper does not provide an explicit statement about releasing code, nor a link to a code repository. The closest mention is in Appendix E.0.1 Resource Costs: "All algorithms including ours are implemented and performed on a server with two Xeon Processor E5-2680 CPUs. Running all defenses for our experiments took approximately 1000 CPU-core hours." This describes the implementation but not its availability.
Open Datasets | Yes | We evaluate shadow learning (with R-SPECTRE filtering) on 4 datasets: EMNIST (Cohen et al., 2017), CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), and Tiny ImageNet (Le & Yang, 2015).
Dataset Splits | No | The paper describes how data is distributed among clients for training (e.g., "100 images are distributed to each client", "partition those samples to 500 users uniformly at random"). It also mentions held-out test datasets (Tb for MTA, Tm for ASR) but does not specify the exact percentages or counts for training/validation/test splits from the overall datasets, nor the method for creating these splits. The text states: "To evaluate a defense, we have two held-out test datasets at the central server. The first, Tb, consists entirely of benign samples. This is used to evaluate main task accuracy (MTA)... The second dataset, Tm, consists entirely of backdoored samples. We use Tm to evaluate attack success rate (ASR)..."
Hardware Specification | Yes | All algorithms including ours are implemented and performed on a server with two Xeon Processor E5-2680 CPUs.
Software Dependencies | No | The paper does not provide specific version numbers for ancillary software components or libraries. It mentions using ResNet-18 as a model architecture, but gives no software versions for its implementation, nor for any other libraries such as PyTorch, TensorFlow, or Python itself.
Experiment Setup | Yes | For all datasets, the server randomly selects 50 clients each round, and each client trains the current model on its local data with batch size 20, learning rate 0.1, for two iterations. The server learning rate is 0.5. The attacker tries to make 7s predicted as 1s for EMNIST, horses as automobiles for CIFAR-10, roses as dolphins for CIFAR-100, and bees as cats for Tiny ImageNet. The backdoor trigger is a 5x5-pixel black square at the bottom right corner of the image. An α fraction of the clients are chosen to be malicious, and each is given 10 corrupted samples. We set the malicious rate α to its upper bound, i.e., α = ᾱ. We set the retraining threshold ϵ1 as 2%, and the convergence threshold ϵ2 as 0.05%. We let the dimensionality reduction parameter k be 32 and set the QUE parameter β as 4 in Algorithm 1.
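The client partitioning quoted in the Dataset Splits row ("partition those samples to 500 users uniformly at random", "100 images are distributed to each client") can be sketched as below. This is an illustrative reconstruction, not the authors' code; the function and parameter names are assumptions.

```python
import random

def partition_uniform(samples, num_users, seed=0):
    """Deal samples to users uniformly at random (illustrative sketch).

    Shuffles the sample list, then assigns samples round-robin so that
    user shard sizes differ by at most one.
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    users = [[] for _ in range(num_users)]
    for i, s in enumerate(shuffled):
        users[i % num_users].append(s)
    return users

# Example: 50,000 sample indices split across 500 users -> 100 each.
shards = partition_uniform(list(range(50_000)), 500)
```

With 50,000 samples and 500 users, every user receives exactly 100 samples, matching the quoted per-client count.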
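Among the baselines listed in the Research Type row, Norm Clipping (Sun et al., 2019) is simple enough to sketch: each client's model update is rescaled so its L2 norm does not exceed a bound before the server averages. This is a generic sketch of the technique, not the paper's implementation; the bound parameter and function names are assumptions.

```python
import math

def clip_update(update, bound):
    """Scale a client update so its L2 norm is at most `bound`."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= bound:
        return list(update)
    scale = bound / norm
    return [x * scale for x in update]

def aggregate(updates, bound):
    """Server step: clip every client update, then average coordinate-wise."""
    clipped = [clip_update(u, bound) for u in updates]
    n = len(clipped)
    return [sum(col) / n for col in zip(*clipped)]
```

The design intuition: a backdoor update typically needs a large norm to survive averaging, so a norm bound limits any single malicious client's influence.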
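The hyperparameters quoted in the Experiment Setup row can be collected into a single configuration sketch. The values below are the ones quoted; the dictionary key names are illustrative, not from the paper.

```python
# Hyperparameters quoted in the Experiment Setup row; key names are
# illustrative, values come from the quoted text.
EXPERIMENT_CONFIG = {
    "clients_per_round": 50,
    "client_batch_size": 20,
    "client_learning_rate": 0.1,
    "client_iterations": 2,
    "server_learning_rate": 0.5,
    "trigger_size_px": (5, 5),               # black square, bottom-right corner
    "corrupted_samples_per_malicious_client": 10,
    "retraining_threshold_eps1": 0.02,       # 2%
    "convergence_threshold_eps2": 0.0005,    # 0.05%
    "dim_reduction_k": 32,
    "que_beta": 4,
}
```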