Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification
Authors: Yuxin Wen, Jonas A. Geiping, Liam Fowl, Micah Goldblum, Tom Goldstein
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the strategy in challenging large-scale settings, obtaining high-fidelity data extraction in both cross-device and cross-silo federated learning. Code is available at https://github.com/JonasGeiping/breaching. We then verify in a range of experiments for the task of image classification that this attack allows us to leverage existing optimization-based (Geiping et al., 2020) and analytic attacks (Lu et al., 2021), which currently only work well for inverting an update calculated on a single or few data points. |
| Researcher Affiliation | Academia | 1 University of Maryland, 2 New York University. |
| Pseudocode | Yes | Detailed implementation can be found in Algorithm 1. |
| Open Source Code | Yes | Code is available at https://github.com/JonasGeiping/breaching. |
| Open Datasets | Yes | All images are from ImageNet ILSVRC 2012 (Russakovsky et al., 2015) with a size of 224 × 224 and include 1000 classes in total. |
| Dataset Splits | Yes | For our quantitative experiments we partition the ImageNet validation set into 100 users with the given batch size, and either allocate each user a different class or assign images to users at random (without replacement). (A partitioning sketch follows the table.) |
| Hardware Specification | Yes | We run all the optimization-based attacks using single 2080ti GPUs and run the analytic attacks via APRIL on CPUs, solving the embedding layer and attention inversion under-determined problems via an SVD solver (dgelss). (An SVD-solve sketch follows the table.) |
| Software Dependencies | No | The paper mentions 'We implement these attacks in a PyTorch framework (Paszke et al., 2017)' but does not provide a specific version number for PyTorch or any other software library. |
| Experiment Setup | Yes | We use α = 1000 for the class fishing strategy and θ = 1000 for the feature fishing strategy... We apply both strategies to the last linear layer of a pre-trained ResNet-18 for all experiments except Section 4.4. In the optimization, we use Adam with step size 0.1 and 50 iterations of warmup over a total of 24K iterations (Yin et al., 2021). The initialization is set to the pattern tiling of 4 × 4 random normal data introduced in (Wei et al., 2020). (An optimizer/initialization sketch follows the table.) |
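
The Dataset Splits row quotes the per-user partitioning of the ImageNet validation set. Below is a minimal sketch of that partitioning, assuming a list of `(image, label)` samples; `partition_users` is a hypothetical helper written for illustration, not taken from the paper's released code:

```python
import random
from collections import defaultdict

def partition_users(samples, num_users=100, batch_size=8, by_class=False, seed=0):
    """Split a labeled dataset into per-user index batches, either one class
    per user or random assignment without replacement (illustrative helper)."""
    rng = random.Random(seed)
    if by_class:
        # Group sample indices by label, then give each user one class.
        per_class = defaultdict(list)
        for idx, (_, label) in enumerate(samples):
            per_class[label].append(idx)
        classes = rng.sample(sorted(per_class), num_users)
        return [per_class[c][:batch_size] for c in classes]
    # Random assignment: draw num_users * batch_size indices without replacement.
    indices = rng.sample(range(len(samples)), num_users * batch_size)
    return [indices[u * batch_size:(u + 1) * batch_size] for u in range(num_users)]
```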
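The Hardware Specification row mentions solving the under-determined embedding and attention inversion problems with the SVD-based LAPACK routine dgelss. A minimal sketch of such a minimum-norm solve via SciPy, with illustrative shapes rather than the paper's actual systems:

```python
import numpy as np
from scipy.linalg import lstsq

# Under-determined system A x = b: more unknowns than equations,
# so the SVD-based driver returns the minimum-norm solution.
A = np.random.randn(64, 256)  # illustrative shapes
b = np.random.randn(64)
x, residues, rank, sv = lstsq(A, b, lapack_driver="gelss")
```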
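The Experiment Setup row specifies Adam with step size 0.1, 50 warmup iterations over 24K total, and initialization by tiling 4 × 4 random normal data. A minimal PyTorch sketch of one plausible reading of that schedule (linear warmup, then constant) and initialization, with a placeholder loss standing in for the paper's actual gradient-matching objective:

```python
import torch

# Tile a 4x4 random normal pattern up to a 224x224 RGB image (4 * 56 = 224).
tile = torch.randn(1, 3, 4, 4)
x = tile.repeat(1, 1, 56, 56).requires_grad_(True)

optimizer = torch.optim.Adam([x], lr=0.1)
warmup, total = 50, 24_000
# Linear warmup for the first 50 steps, constant learning rate afterwards.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda step: min(1.0, (step + 1) / warmup)
)

for step in range(total):
    optimizer.zero_grad()
    loss = x.square().mean()  # placeholder; the real attack matches gradients
    loss.backward()
    optimizer.step()
    scheduler.step()
```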