Privacy-Aware Compression for Federated Learning Through Numerical Mechanism Design

Authors: Chuan Guo, Kamalika Chaudhuri, Pierre Stock, Michael Rabbat

ICML 2023

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental. Evidence: "Experimentally, we find that under both client-level and sample-level DP settings and across various benchmark datasets, the I-MVU mechanism provides a better privacy-utility trade-off than Sign SGD (Jin et al., 2020) and MVU (Chaudhuri et al., 2022) at an extremely low communication budget of one bit per gradient dimension. Moreover, I-MVU achieves close to the same performance as the standard non-compressed Laplace and Gaussian mechanisms (Abadi et al., 2016) for similar levels of (ϵ, δ)-DP, leading to new state-of-the-art results for private communication-efficient FL." From Section 4 (Experiments): "We evaluate the I-MVU mechanism for federated learning under the local DP setting, i.e., clients transmit the privately compressed model update M(x) to the server before aggregation."
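The local DP setting quoted above can be sketched as a single FL round in which each client privatizes its own update before transmission and the server only ever sees privatized updates. This is an illustrative sketch only: it substitutes a plain Gaussian mechanism for the paper's I-MVU compression (which additionally quantizes to roughly one bit per dimension), and all function names and parameter values are hypothetical.

```python
import numpy as np

def privatize(update, clip=1.0, sigma=0.8, rng=None):
    """Stand-in for the mechanism M(x): clip the L2 norm, then add
    Gaussian noise. I-MVU would also compress the result; that step
    is omitted here."""
    rng = rng or np.random.default_rng()
    norm = max(np.linalg.norm(update), 1e-12)
    clipped = update * min(1.0, clip / norm)
    return clipped + rng.normal(0.0, sigma * clip, size=update.shape)

def fl_round(client_grads, seed=0):
    """Server-side aggregation: average updates that were already
    privatized on the client (local DP), so the raw gradients are
    never transmitted."""
    rng = np.random.default_rng(seed)
    noisy = [privatize(g, rng=rng) for g in client_grads]
    return np.mean(noisy, axis=0)

grads = [np.ones(4), -np.ones(4), np.zeros(4)]
agg = fl_round(grads)
print(agg.shape)  # (4,)
```

The key design point mirrored here is that noise is added before the update leaves the client, so the server's view is differentially private without trusting the server.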
Researcher Affiliation: Industry. Evidence: "Meta AI. Correspondence to: Chuan Guo <chuanguo@meta.com>, Kamalika Chaudhuri <kamalika@meta.com>."
Pseudocode: No. The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code: No. The paper contains neither an explicit statement about open-sourcing code nor a link to a code repository for the described methodology.
Open Datasets: Yes. Evidence: "We first evaluate under the client-level DP setting on MNIST and CIFAR-10 (Krizhevsky et al., 2009). Here, the privacy analysis guarantees that the learning algorithm is differentially private with respect to the removal of any client. We divide the training set among the clients with client sample size 1. Each client performs a single local gradient update in every FL round. This setting is equivalent to DP-SGD training (Abadi et al., 2016) but with the Gaussian mechanism replaced by a communication-efficient private mechanism. Next, we evaluate under the sample-level DP setting on the FEMNIST dataset (Caldas et al., 2018) for classifying written characters into 62 distinct classes."
Dataset Splits: Yes. Evidence: "The dataset has a pre-defined train split with 3,500 clients, from which we randomly select 3,150 clients for training and the remaining 350 clients for testing."
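The split quoted above is a simple random partition of FEMNIST's pre-defined client pool. A minimal sketch, assuming a uniform shuffle over client IDs (the paper's exact selection procedure and seed are not specified):

```python
import random

def split_clients(num_clients=3500, num_train=3150, seed=0):
    """Randomly partition the 3,500 pre-defined FEMNIST train clients
    into 3,150 training clients and 350 held-out test clients."""
    ids = list(range(num_clients))
    random.Random(seed).shuffle(ids)
    return ids[:num_train], ids[num_train:]

train_ids, test_ids = split_clients()
print(len(train_ids), len(test_ids))  # 3150 350
```

Note that the held-out set here consists of entire clients rather than individual samples, matching the client-level evaluation described in the report.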
Hardware Specification: No. The paper does not provide specific hardware details such as GPU/CPU models or the types of computing resources used for experiments.
Software Dependencies: No. The paper does not list software dependencies with version numbers, such as specific library or framework versions.
Experiment Setup: Yes. Evidence: "Following (Chaudhuri et al., 2022), we train a linear model on top of Scatter Net features (Tramer & Boneh, 2020). This training recipe remains highly competitive under the central DP setting for MNIST and CIFAR-10 without leveraging any public data, hence we adopt it for FL training under local DP. Following Abadi et al. (2016), we apply L1 and L2 gradient norm clipping to control the gradient sensitivity and then apply a privacy-aware compression mechanism to transmit the clipped gradient privately. We perform a grid search over hyperparameters such as number of update rounds, step size, gradient norm clip, and mechanism parameters σ (for Gaussian and Sign SGD) and ϵ (for MVU and I-MVU)." (Appendix C contains tables of the hyperparameter ranges.)
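The clipping step described above bounds gradient sensitivity before the private mechanism is applied. A hedged sketch of joint L1 and L2 norm clipping, with placeholder thresholds rather than the paper's tuned values; since each step only rescales by a factor of at most 1, applying the L2 clip after the L1 clip cannot violate the L1 bound:

```python
import numpy as np

def clip_gradient(g, c1=1.0, c2=1.0):
    """Scale g so that both ||g||_1 <= c1 and ||g||_2 <= c2 hold,
    bounding sensitivity for the subsequent private mechanism.
    Each rescaling factor is <= 1, so the second clip preserves
    the first bound."""
    g = g * min(1.0, c1 / max(np.linalg.norm(g, 1), 1e-12))
    g = g * min(1.0, c2 / max(np.linalg.norm(g, 2), 1e-12))
    return g

g = clip_gradient(np.array([3.0, -4.0]))
print(np.linalg.norm(g, 1), np.linalg.norm(g, 2))
```

The grid search mentioned in the row would then sweep c1, c2 together with the step size, number of rounds, and the mechanism parameters σ or ϵ.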