Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy
Authors: Xinwei Zhang, Xiangyi Chen, Mingyi Hong, Steven Wu, Jinfeng Yi
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we first empirically demonstrate that the clipped FedAvg can perform surprisingly well even with substantial data heterogeneity when training neural networks... Based on this key observation, we provide the convergence analysis of a differentially private (DP) FedAvg algorithm... To the best of our knowledge, this is the first work that rigorously investigates theoretical and empirical issues regarding the clipping operation in FL algorithms. |
| Researcher Affiliation | Collaboration | (1) Department of Electrical and Computer Engineering, University of Minnesota, MN, United States; (2) School of Computer Science, Carnegie Mellon University, PA, United States; (3) JD.com, Inc., Shanghai, China. |
| Pseudocode | Yes | Algorithm 1: FedAvg Algorithm; Algorithm 2: Clipping-enabled FedAvg Algorithm (CE-FedAvg); Algorithm 3: DP-FedAvg Algorithm |
| Open Source Code | No | No, the paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | We run the algorithm using AlexNet (Krizhevsky et al., 2012) and ResNet-18 (He et al., 2016) with the EMNIST dataset (Cohen et al., 2017) and the CIFAR-10 dataset (Krizhevsky et al., 2009) for comparison... We also run the algorithm using the LSTM model used in (Reddi et al., 2021) on the NLP problem with the Shakespeare dataset (Caldas et al., 2018). |
| Dataset Splits | Yes | We split the dataset in two different ways: 1) IID Data setting, where the samples are uniformly distributed to each client; 2) Non-IID Data setting, where the clients have unbalanced samples. For the EMNIST digit classification dataset, each client has 500 non-overlapping samples. In the IID case, each client has around 50 samples of each class; in the Non-IID case, each client has 8 classes with around 5 samples each and 2 classes with 230 samples each. (A sketch of such a partition appears after the table.) |
| Hardware Specification | No | No, the paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments. |
| Software Dependencies | No | No, the paper does not provide specific software dependencies with version numbers (e.g., library names like PyTorch with a version). |
| Experiment Setup | Yes | In both experiments, we set the number of clients N = 1920, the number of clients participating in each round \|Pt\| = 80 for all t, the number of local iterations Q = 32, and the mini-batch size to 64. The clipping threshold is set to 50% of the average (over clients and iterations) of the local update magnitudes recorded in FedAvg. For DP-FedAvg we set the clipping threshold the same as in CE-FedAvg, and we fix the number of communication rounds and the privacy budget to obtain the noise variance that needs to be added. (A minimal sketch of the clipping and noising steps follows the table.) |
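
The following is a minimal NumPy sketch, not the authors' released code, of the clipping and noising steps summarized in the pseudocode and experiment-setup rows above. The helper names (`clip_update`, `aggregate_dp`, `clip_threshold`, `noise_std`) and the exact noise calibration are illustrative assumptions; the paper derives the noise variance from the clipping threshold, the fixed number of communication rounds, and the privacy budget.

```python
# Hypothetical sketch of per-client update clipping (as in CE-FedAvg) and
# Gaussian noising at the server (as in DP-FedAvg). Not the authors' code.
import numpy as np

def clip_update(update, clip_threshold):
    """Scale a client's local model update so its l2 norm is at most clip_threshold."""
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_threshold / (norm + 1e-12))

def aggregate_dp(client_updates, clip_threshold, noise_std, rng):
    """Average clipped updates and add Gaussian noise for client-level DP.

    `noise_std` is assumed to be pre-computed from the clipping threshold,
    the number of communication rounds, and the privacy budget.
    """
    clipped = [clip_update(u, clip_threshold) for u in client_updates]
    noise = rng.normal(0.0, noise_std, size=clipped[0].shape)
    # Noise is added to the sum of the clipped updates, then the result is averaged.
    return (np.sum(clipped, axis=0) + noise) / len(client_updates)

# Toy usage with the reported round size |P_t| = 80 and a 10-dimensional model.
rng = np.random.default_rng(0)
updates = [rng.normal(size=10) for _ in range(80)]
avg_norm = np.mean([np.linalg.norm(u) for u in updates])
new_direction = aggregate_dp(updates, clip_threshold=0.5 * avg_norm,
                             noise_std=0.1, rng=rng)
```

Setting `clip_threshold` to half of the average recorded update norm in the usage example mirrors the 50% rule quoted in the experiment-setup row; the `noise_std=0.1` value is purely illustrative.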
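
The IID and non-IID partitions described in the dataset-splits row can be reproduced with a simple label-based assignment. The sketch below is one assumed way to do it (the functions `split_iid` and `split_non_iid`, and the choice of which two classes dominate each client, are invented for illustration): each client receives 500 non-overlapping sample indices, with two classes of 230 samples and eight classes of about 5 samples in the non-IID case.

```python
# Hypothetical partitioning of a labeled dataset (e.g., EMNIST digits) into
# per-client index lists, mirroring the splits described above. Assumes each
# class pool is large enough for the requested number of clients.
import numpy as np

def split_iid(labels, num_clients, samples_per_client=500, num_classes=10, seed=0):
    """IID split: each client gets samples_per_client/num_classes samples of every class."""
    rng = np.random.default_rng(seed)
    per_class = samples_per_client // num_classes
    pools = {c: rng.permutation(np.where(labels == c)[0]).tolist()
             for c in range(num_classes)}
    return [[pools[c].pop() for c in range(num_classes) for _ in range(per_class)]
            for _ in range(num_clients)]

def split_non_iid(labels, num_clients, num_classes=10, big=230, small=5, seed=0):
    """Non-IID split: 2 'large' classes (big samples each) and 8 'small' classes per client."""
    rng = np.random.default_rng(seed)
    pools = {c: rng.permutation(np.where(labels == c)[0]).tolist()
             for c in range(num_classes)}
    clients = []
    for i in range(num_clients):
        # Rotate which two classes are dominant; the paper does not specify this choice.
        large = {(2 * i) % num_classes, (2 * i + 1) % num_classes}
        idx = []
        for c in range(num_classes):
            take = big if c in large else small
            idx.extend(pools[c][:take])
            pools[c] = pools[c][take:]
        clients.append(idx)
    return clients
```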