Tight and Robust Private Mean Estimation with Few Users

Authors: Shyam Narayanan, Vahab Mirrokni, Hossein Esfandiari

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical In this work, we study high-dimensional mean estimation under user-level differential privacy, and design an (ε, δ)-differentially private mechanism using as few users as possible. In particular, we provide a nearly optimal trade-off between the number of users and the number of samples per user required for private mean estimation, even when the number of users is as low as O( 1δ ). Interestingly, this bound on the number of users is independent of the dimension (though the number of samples per user is allowed to depend polynomially on the dimension), unlike the previous work that requires the number of users to depend polynomially on the dimension. This resolves a problem first proposed by Amin et al. (2019). Moreover, our mechanism is robust against corruptions in up to 49% of the users. Finally, our results also apply to optimal algorithms for privately learning discrete distributions with few users, answering a question of Liu et al. (2020), and a broader range of problems such as stochastic convex optimization and a variant of stochastic gradient descent via a reduction to differentially private mean estimation.
Researcher Affiliation Collaboration 1Google Research, New York, NY, USA 2Massachusetts Institute of Technology, Cambridge, MA, USA. Correspondence to: Shyam Narayanan <shyamsn@mit.edu>.
Pseudocode Yes Algorithm 1 : DP-ESTIMATE-1(x1, . . . , xn, α, ε, δ). Differentially private estimation algorithm taking n = O( 1δ ) points x1, . . . , xn Rd. If there exists x within r of at least 2/3 of the points {xi}, then the algorithm outputs a point within O(r d) of x, with probability 1 α. ... Algorithm 2 : DP-ESTIMATE-2(x1, . . . , xn, α, ε, δ). Differentially private estimation algorithm taking n = O( 1α ) samples, with failure probability α and runtime only O(n1+o(1) d log 1α ). ... Algorithm 3 : DP-ESTIMATE-USER (X = {Xi,j}n m i=1,j=1), α, ε, δ, k. User-Level differentially private estimation algorithm taking O( 1ε log 1α δ) n O(d 1ε log 1δ ) users and m samples in Rd per user, and a failure probability α.
Open Source Code No The paper does not provide any links to open-source code or explicitly state that the code for the described methodology is publicly available.
Open Datasets No The paper is theoretical and does not conduct experiments involving training data splits for its own evaluation. It mentions "iterative training procedures" in a general context, but not for its own methodology validation.
Dataset Splits No The paper is theoretical and does not conduct experiments involving validation data splits for its own evaluation.
Hardware Specification No The paper is theoretical and does not describe any specific hardware used for experiments.
Software Dependencies No The paper is theoretical and focuses on algorithms and mathematical proofs. It does not specify any software dependencies with version numbers required for implementation or reproduction.
Experiment Setup No The paper is theoretical and does not describe any specific experimental setup details such as hyperparameters or training configurations.