CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning

Authors: Samuel Maddock, Alexandre Sablayrolles, Pierre Stock

ICLR 2023

Reproducibility assessment: each variable below is listed with its result and the supporting evidence (LLM response) quoted from the paper.

Research Type: Experimental
Evidence: "We apply this attack to vision models trained on CIFAR-10 and CelebA, and to language models trained on Sent140 and Shakespeare."

Researcher Affiliation: Collaboration
Evidence: Samuel Maddock (University of Warwick); Alexandre Sablayrolles (Meta AI); Pierre Stock (Meta AI).

Pseudocode: Yes
Evidence: "Algorithm 1: CANIFE attack by a rogue client."

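Algorithm 1's core step crafts a canary whose gradient is nearly orthogonal to the gradients of a pool of held-out samples, so that its presence in a round is easy to test. The loop below is a rough, hypothetical sketch of that idea, not the authors' released code; the function names, objective, and calling convention are all assumptions.

    import torch

    def flat_grad(model, loss, create_graph=False):
        # Gradient of `loss` w.r.t. all model parameters, flattened to one vector.
        grads = torch.autograd.grad(loss, list(model.parameters()),
                                    create_graph=create_graph)
        return torch.cat([g.reshape(-1) for g in grads])

    def craft_canary(model, loss_fn, sample_grads, canary_x, canary_y,
                     steps=1000, lr=1.0):
        # Optimise the canary input so its gradient is near-orthogonal to the
        # (detached) per-sample gradients of a held-out pool.
        canary_x = canary_x.clone().requires_grad_(True)
        opt = torch.optim.Adam([canary_x], lr=lr)  # Adam, lr = 1 (cf. setup row below)
        G = torch.stack(sample_grads)              # shape: (n_samples, n_params)
        for _ in range(steps):
            opt.zero_grad()
            g = flat_grad(model, loss_fn(model(canary_x), canary_y),
                          create_graph=True)
            # Normalised squared alignment: pushes g out of the span of the
            # sample gradients without letting it simply shrink to zero.
            ((G @ g).pow(2).sum() / g.pow(2).sum()).backward()
            opt.step()
            model.zero_grad(set_to_none=True)      # clear stale parameter grads
        return canary_x.detach()
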
Open Source Code: Yes
Evidence: "We open-source the code for CANIFE design and testing to reproduce our results." Code available at https://github.com/facebookresearch/canife

Open Datasets: Yes
Evidence: "We utilise LEAF (Caldas et al., 2018), which provides benchmark federated datasets for simulating clients with non-IID data and a varying number of local samples. We study image classification on CIFAR-10 (IID) (Krizhevsky et al., 2009) and CelebA (non-IID) (Liu et al., 2015). We train an LSTM model on non-IID splits of Sent140 (Go et al., 2009) and Shakespeare (McMahan et al., 2017a)."

Dataset Splits: Yes
Evidence (one split per dataset, in the order listed above):
- CIFAR-10 (IID): 50,000 train users and 10,000 test users; each user holds a single sample, giving a local batch size of 1.
- CelebA: standard non-IID LEAF split with 8,408 train users and 935 test users; local batch size 32.
- Sent140: standard non-IID LEAF splits with 59,214 train users and 39,477 test users; local batch size 32.
- Shakespeare: standard non-IID LEAF splits with 1,016 train users and 113 test users; local batch size 128.

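For concreteness, the CIFAR-10 split above amounts to giving every sample its own simulated client. A minimal sketch under that reading (illustrative only, not the released code):

    import numpy as np

    def make_iid_user_split(num_train_users=50_000, num_test_users=10_000, seed=0):
        # Each CIFAR-10 sample becomes its own "user", so every simulated
        # client trains with a local batch size of 1.
        rng = np.random.default_rng(seed)
        idx = rng.permutation(num_train_users + num_test_users)
        train_users = [[int(i)] for i in idx[:num_train_users]]
        test_users = [[int(i)] for i in idx[num_train_users:]]
        return train_users, test_users  # lists of per-user sample indices
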
Hardware Specification: Yes
Evidence: "All experiments were run on a single A100 40GB GPU, with model training taking at most a few hours. We additionally benchmarked the average CPU time on an M1 MacBook Air (2020) for a single design iteration."

Software Dependencies: Yes
Evidence: "For privacy accounting, we utilise the RDP accountant with subsampling (Mironov et al., 2019), implemented via the Opacus library (Yousefpour et al., 2021)."

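Opacus exposes this accountant directly. A minimal sketch of subsampled-RDP accounting across federated rounds; the round count, sampling rate, and noise multiplier are placeholders rather than the paper's values:

    from opacus.accountants import RDPAccountant

    accountant = RDPAccountant()
    sample_rate = 100 / 50_000   # fraction of clients sampled per round (placeholder)
    noise_multiplier = 1.0       # Gaussian noise sigma (placeholder)

    for _ in range(1_000):       # one accounting step per federated round
        accountant.step(noise_multiplier=noise_multiplier, sample_rate=sample_rate)

    print(accountant.get_epsilon(delta=1e-5))  # epsilon at the chosen delta
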
Experiment Setup: Yes
Evidence: canary design uses the Adam optimizer (Kingma & Ba, 2014) with learning rate β = 1 and a fixed clipping bound C = 1. Per-dataset client (ηC) and server (ηS) learning rates, in the order listed above:
- CIFAR-10: ηC = 0.01, ηS = 1
- CelebA: ηC = 0.899, ηS = 0.0797
- Sent140: ηC = 5.75, ηS = 0.244
- Shakespeare: ηC = 3, ηS = 0.524
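
For orientation, the client/server learning-rate pairing above follows the usual FedAvg/FedOpt pattern: clients take local SGD steps at ηC and the server applies the averaged update at ηS. The sketch below assumes that structure (it is not the authors' training loop) and borrows only the CIFAR-10 rates:

    import copy
    import torch

    ETA_C, ETA_S = 0.01, 1.0  # client/server rates from the CIFAR-10 row above

    def client_delta(global_model, batch, loss_fn):
        # One local SGD step at rate ETA_C; the client returns its model delta.
        model = copy.deepcopy(global_model)
        opt = torch.optim.SGD(model.parameters(), lr=ETA_C)
        x, y = batch
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
        return [p.detach() - q.detach()
                for p, q in zip(model.parameters(), global_model.parameters())]

    def server_step(global_model, deltas):
        # Server averages the client deltas and applies them at rate ETA_S.
        with torch.no_grad():
            for p, *ds in zip(global_model.parameters(), *deltas):
                p.add_(ETA_S * torch.stack(ds).mean(dim=0))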