CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning
Authors: Samuel Maddock, Alexandre Sablayrolles, Pierre Stock
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply this attack to vision models trained on CIFAR-10 and CelebA and to language models trained on Sent140 and Shakespeare. |
| Researcher Affiliation | Collaboration | Samuel Maddock University of Warwick Alexandre Sablayrolles Meta AI Pierre Stock Meta AI |
| Pseudocode | Yes | Algorithm 1 CANIFE attack by a rogue client |
| Open Source Code | Yes | We open-source the code for CANIFE design and testing to reproduce our results. Code available at https://github.com/facebookresearch/canife |
| Open Datasets | Yes | We utilise LEAF (Caldas et al., 2018) which provides benchmark federated datasets for simulating clients with non-IID data and a varying number of local samples. We study image classification on CIFAR10 (IID) (Krizhevsky et al., 2009) and CelebA (non-IID) (Liu et al., 2015). We train an LSTM model on non-IID splits of Sent140 (Go et al., 2009) and Shakespeare (McMahan et al., 2017a). |
| Dataset Splits | Yes | We form an IID split of 50,000 train users and 10,000 test users where each user holds a single sample and thus has a local batch size of 1. We use the standard non-IID LEAF split resulting in 8408 train users and 935 test users and a local batch size of 32. We use standard non-IID LEAF splits resulting in 59,214 train users and 39,477 test users and a local batch size of 32. We use standard non-IID LEAF splits with 1016 train users and 113 test users and a local batch size of 128. |
| Hardware Specification | Yes | All experiments were run on a single A100 40GB GPU with model training taking at most a few hours. We additionally benchmarked the average CPU time on an M1 MacBook Air (2020) for a single design iteration. |
| Software Dependencies | Yes | For privacy accounting, we utilise the RDP accountant with subsampling (Mironov et al., 2019) implemented via the Opacus library (Yousefpour et al., 2021). |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma & Ba, 2014) with learning rate β = 1 and fix C = 1. We use a client learning rate of ηC = 0.01 and server learning rate ηS = 1. We train with a client learning rate of ηC = 0.899 and a server learning rate of ηS = 0.0797. We train with a client learning rate of ηC = 5.75 and a server learning rate of ηS = 0.244. We use a client learning rate of ηC = 3 and a server learning rate of ηS = 0.524. |
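The Experiment Setup row above quotes client/server learning rates and a clipping bound C = 1, which correspond to a standard DP-FedAvg-style server step: clip each client's model delta to L2 norm C, average, add Gaussian noise, and apply the server learning rate ηS. A minimal pure-Python sketch of that round is below; the function names and list-based vectors are illustrative assumptions, not the paper's actual implementation.

```python
import random

def clip(update, c=1.0):
    """Clip an update vector to L2 norm at most c (the C = 1 bound above)."""
    norm = sum(u * u for u in update) ** 0.5
    scale = min(1.0, c / norm) if norm > 0 else 1.0
    return [u * scale for u in update]

def server_step(model, client_updates, eta_s=1.0, c=1.0, sigma=0.0, rng=None):
    """One DP-FedAvg round: clip client deltas, average, add Gaussian
    noise scaled by sigma * c / n, then apply the server learning rate."""
    rng = rng or random.Random(0)
    clipped = [clip(u, c) for u in client_updates]
    n = len(clipped)
    avg = [sum(col) / n for col in zip(*clipped)]
    noisy = [a + rng.gauss(0.0, sigma * c / n) for a in avg]
    return [m + eta_s * d for m, d in zip(model, noisy)]

# Example round with two clients and sigma = 0 (no noise, for clarity):
new_model = server_step([0.0, 0.0], [[3.0, 4.0], [0.0, 2.0]],
                        eta_s=1.0, c=1.0, sigma=0.0)
```

In the example, the delta [3, 4] (norm 5) is scaled to [0.6, 0.8] and [0, 2] to [0, 1], so the averaged update is [0.3, 0.9]. The per-dataset learning rates quoted above (e.g. ηC = 0.01, ηS = 1) would be passed in place of these defaults.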