Tight and Robust Private Mean Estimation with Few Users
Authors: Shyam Narayanan, Vahab Mirrokni, Hossein Esfandiari
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we study high-dimensional mean estimation under user-level differential privacy, and design an (ε, δ)-differentially private mechanism using as few users as possible. In particular, we provide a nearly optimal trade-off between the number of users and the number of samples per user required for private mean estimation, even when the number of users is as low as O(1/δ). Interestingly, this bound on the number of users is independent of the dimension (though the number of samples per user is allowed to depend polynomially on the dimension), unlike the previous work that requires the number of users to depend polynomially on the dimension. This resolves a problem first proposed by Amin et al. (2019). Moreover, our mechanism is robust against corruptions in up to 49% of the users. Finally, our results also apply to optimal algorithms for privately learning discrete distributions with few users, answering a question of Liu et al. (2020), and a broader range of problems such as stochastic convex optimization and a variant of stochastic gradient descent via a reduction to differentially private mean estimation. |
| Researcher Affiliation | Collaboration | ¹Google Research, New York, NY, USA; ²Massachusetts Institute of Technology, Cambridge, MA, USA. Correspondence to: Shyam Narayanan <shyamsn@mit.edu>. |
| Pseudocode | Yes | Algorithm 1: DP-ESTIMATE-1(x_1, . . . , x_n, α, ε, δ). Differentially private estimation algorithm taking n = O(1/δ) points x_1, . . . , x_n ∈ ℝ^d. If there exists x within r of at least 2/3 of the points {x_i}, then the algorithm outputs a point within O(r√d) of x, with probability 1 − α. ... Algorithm 2: DP-ESTIMATE-2(x_1, . . . , x_n, α, ε, δ). Differentially private estimation algorithm taking n = O(1/α) samples, with failure probability α and runtime only O(n^(1+o(1)) · d · log(1/α)). ... Algorithm 3: DP-ESTIMATE-USER(X = {X_{i,j}}_{i=1..n, j=1..m}, α, ε, δ, k). User-level differentially private estimation algorithm taking O((1/ε) · log(1/(αδ))) ≤ n ≤ O(d · (1/ε) · log(1/δ)) users and m samples in ℝ^d per user, and a failure probability α. (Two hedged, non-authoritative sketches follow the table: a generic user-level Gaussian-mechanism baseline for the setting described in the abstract above, and a non-private aggregator motivating the O(r√d) factor in Algorithm 1's guarantee.) |
| Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that the code for the described methodology is publicly available. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments involving training data splits for its own evaluation. It mentions "iterative training procedures" in a general context, but not for its own methodology validation. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments involving validation data splits for its own evaluation. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and focuses on algorithms and mathematical proofs. It does not specify any software dependencies with version numbers required for implementation or reproduction. |
| Experiment Setup | No | The paper is theoretical and does not describe any specific experimental setup details such as hyperparameters or training configurations. |
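
To make the user-level setting quoted in the Research Type row concrete, here is a minimal, non-authoritative sketch of the standard baseline the paper improves on: reduce each user to the mean of their samples, clip each contribution in ℓ2 norm, and release the average with Gaussian noise calibrated to (ε, δ). This is just the generic Gaussian mechanism, not the paper's DP-ESTIMATE algorithms; the function and parameter names (`user_level_gaussian_mean`, `clip_norm`) are illustrative assumptions.

```python
import numpy as np

def user_level_gaussian_mean(samples, eps, delta, clip_norm):
    """Generic user-level (eps, delta)-DP mean estimate via the Gaussian mechanism.

    samples: list of (m_i, d) arrays, one array of i.i.d. samples per user.
    NOT the paper's DP-ESTIMATE mechanisms; just a standard baseline whose
    error forces the number of users to grow polynomially with the dimension.
    """
    # Step 1: reduce each user to a single vector (their sample mean).
    user_means = np.stack([np.mean(s, axis=0) for s in samples])
    n, d = user_means.shape

    # Step 2: clip each user's vector to l2 norm <= clip_norm, so that changing
    # all of one user's samples moves the average by at most 2 * clip_norm / n.
    norms = np.linalg.norm(user_means, axis=1, keepdims=True)
    clipped = user_means * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # Step 3: add Gaussian noise calibrated to the l2 sensitivity
    # (classical Gaussian mechanism; valid for eps < 1).
    sensitivity = 2.0 * clip_norm / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return clipped.mean(axis=0) + np.random.normal(0.0, sigma, size=d)
```

The ℓ2 norm of the added noise scales like √d/(nε) (up to the clipping bound and a log(1/δ) factor), so keeping the error below a target α requires roughly n ≳ √d/(εα) users; that polynomial dependence of the user count on the dimension is exactly what the abstract says the paper's mechanism avoids.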
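The O(r√d) guarantee quoted for Algorithm 1 can be motivated by a simple non-private observation: if at least 2/3 of the points lie within ℓ2 distance r of some center x, then in every coordinate at least 2/3 of the values lie within r of that coordinate of x, so the coordinate-wise median is within r in every coordinate and hence within r√d of x in ℓ2. The sketch below implements only this non-private aggregator (the helper name `coordinatewise_median` is illustrative), not the paper's private algorithm; the paper's contribution is achieving a comparable guarantee under (ε, δ)-differential privacy.

```python
import numpy as np

def coordinatewise_median(points):
    """Non-private aggregator: coordinate-wise median of an (n, d) array of points."""
    return np.median(points, axis=0)

# Toy check of the 2/3-concentration promise behind Algorithm 1's guarantee.
rng = np.random.default_rng(0)
d, r = 50, 0.1
center = rng.normal(size=d)
good = center + (r / np.sqrt(d)) * rng.normal(size=(20, d))  # each at l2 distance roughly r from center
bad = 10.0 * rng.normal(size=(10, d))                        # arbitrary outliers (at most 1/3 of points)
estimate = coordinatewise_median(np.vstack([good, bad]))
print(np.linalg.norm(estimate - center))  # small; well inside the O(r * sqrt(d)) worst-case bound
```

With 20 concentrated points and 10 outliers, the concentrated points form the required 2/3 majority in every coordinate, so the printed error stays far below r√d.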