Mean Estimation with User-level Privacy under Data Heterogeneity
Authors: Rachel Cummings, Vitaly Feldman, Audra McMillan, Kunal Talwar
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work we propose a simple model of heterogeneous user data that differs in both distribution and quantity of data, and we provide a method for estimating the population-level mean while preserving user-level differential privacy. We demonstrate asymptotic optimality of our estimator and also prove general lower bounds on the error achievable in our problem. |
| Researcher Affiliation | Collaboration | Rachel Cummings, Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027, rac2239@columbia.edu; Vitaly Feldman, Apple, Cupertino, CA 95014; Audra McMillan, Apple, Cupertino, CA 95014, audra.mcmillan@apple.com; Kunal Talwar, Apple, Cupertino, CA 95014, ktalwar@apple.com |
| Pseudocode | Yes | Algorithm 1: Non-private Heterogeneous Mean Estimation. Algorithm 2: Private Heterogeneous Mean Estimation. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not describe experiments performed on a specific, publicly available dataset. It discusses data in a general, abstract sense (e.g., 'user data is heterogeneous', 'each user generates multiple data points'). |
| Dataset Splits | No | The paper is theoretical and does not describe experiments that would involve dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any computational experiments or their hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe any computational experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any computational experiments or their setup details, such as hyperparameters or training configurations. |