Covariance-Aware Private Mean Estimation Without Private Covariance Estimation

Authors: Gavin Brown, Marco Gaboardi, Adam Smith, Jonathan Ullman, Lydia Zakynthinou

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We present two sample-efficient differentially private mean estimators for d-dimensional (sub)Gaussian distributions with unknown covariance. Informally, given n ≳ d/α² samples from such a distribution with mean µ and covariance Σ, our estimators output µ̃ such that ‖µ̃ − µ‖_Σ ≤ α, where ‖·‖_Σ is the Mahalanobis distance. All previous estimators with the same guarantee either require strong a priori bounds on the covariance matrix or require Ω(d^{3/2}) samples. Each of our estimators is based on a simple, general approach to designing differentially private mechanisms, but with novel technical steps to make the estimator private and sample-efficient. Our first estimator samples a point with approximately maximum Tukey depth using the exponential mechanism, but restricted to the set of points of large Tukey depth. Proving that this mechanism is private requires a novel analysis. Our second estimator perturbs the empirical mean of the data set with noise calibrated to the empirical covariance, without releasing the covariance itself. For both estimators, careful preprocessing of the data is required to satisfy differential privacy. (The Mahalanobis-norm guarantee above is written out informally in the note after the table.)
Researcher Affiliation | Academia | Gavin Brown, Department of Computer Science, Boston University, grbrown@bu.edu; Marco Gaboardi, Department of Computer Science, Boston University, gaboardi@bu.edu; Adam Smith, Department of Computer Science, Boston University, ads22@bu.edu; Jonathan Ullman, Khoury College of Computer Sciences, Northeastern University, jullman@ccs.neu.edu; Lydia Zakynthinou, Khoury College of Computer Sciences, Northeastern University, zakynthinou.l@northeastern.edu
Pseudocode | Yes | Algorithm 1: Restricted Exponential Mechanism A^E_{ε,δ,t}(x) ... Algorithm 2: Empirically Rescaled Gaussian Mechanism A^G_{ε,δ,β}(x). (An illustrative sketch of Algorithm 2's core idea appears after the table.)
Open Source Code | No | The paper states "Our estimators are not computationally efficient. We provide finite implementations which construct and search over a sufficiently fine grid, resulting in running time exponential in the dimension d and sample size n." but does not state that this code is publicly available. Furthermore, in the author checklist, the answer to question 3(a), "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?", is "[N/A]".
Open Datasets | No | The paper is theoretical and focuses on algorithm design and theoretical guarantees for (sub)Gaussian distributions. It does not describe any empirical experiments involving training, validation, or test datasets. The author checklist explicitly states "[N/A]" for questions related to running experiments, including code, data, and training details.
Dataset Splits | No | The paper is theoretical and focuses on algorithm design and theoretical guarantees for (sub)Gaussian distributions. It does not describe any empirical experiments involving training, validation, or test splits. The author checklist explicitly states "[N/A]" for questions related to running experiments, including training details.
Hardware Specification | No | The paper is theoretical and does not describe empirical experiments, so it does not specify any hardware used. The author checklist explicitly states "[N/A]" for questions related to computing resources for experiments.
Software Dependencies | No | The paper is theoretical and does not describe empirical experiments, so it does not list any specific software dependencies with version numbers. The author checklist explicitly states "[N/A]" for questions related to computing resources for experiments.
Experiment Setup | No | The paper is theoretical and focuses on algorithm design and theoretical guarantees. It does not describe empirical experiments or their setup, such as hyperparameter values or training configurations. The author checklist explicitly states "[N/A]" for questions related to training details for experiments.
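
Note on the accuracy guarantee. The guarantee quoted in the Research Type row is stated in the Mahalanobis norm induced by the unknown covariance Σ. As a reference point, an informal restatement (with the failure probability β made explicit and the dependence on the privacy parameters ε, δ and on logarithmic factors suppressed; the exact bounds are in the paper's theorems) is:

    \|v\|_{\Sigma} = \sqrt{v^{\top} \Sigma^{-1} v},
    \qquad
    \Pr\big[\, \|\tilde{\mu}(X) - \mu\|_{\Sigma} \le \alpha \,\big] \ge 1 - \beta
    \quad \text{whenever } n \gtrsim d/\alpha^{2}.

Here \tilde{\mu}(X) denotes the estimator's output on a sample X of n points drawn from the (sub)Gaussian distribution.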
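
Sketch of the second estimator's core idea. The paper's Algorithm 2, the Empirically Rescaled Gaussian Mechanism, perturbs the empirical mean with Gaussian noise shaped by the empirical covariance, without releasing the covariance itself. The Python/NumPy sketch below illustrates only that rescaling idea: the function name and the noise-scale constant are assumptions made for illustration, and the paper's preprocessing step, which is essential for the privacy proof, is omitted, so this snippet on its own is not differentially private.

    import numpy as np

    def rescaled_gaussian_mean(x, eps, delta, rng=None):
        # Sketch: release the empirical mean plus Gaussian noise whose shape
        # is the empirical covariance. Without the paper's preprocessing step
        # this is NOT differentially private; it only shows the rescaling idea.
        rng = np.random.default_rng() if rng is None else rng
        n, d = x.shape
        mu_hat = x.mean(axis=0)                # empirical mean, shape (d,)
        sigma_hat = np.cov(x, rowvar=False)    # empirical covariance, (d, d)
        # Assumed Gaussian-mechanism-style noise scale, used here for
        # illustration only; the paper's Algorithm 2 has its own calibration
        # (with parameters eps, delta, beta) and proves privacy for it.
        scale = np.sqrt(2.0 * np.log(1.25 / delta)) / (eps * n)
        noise = rng.multivariate_normal(np.zeros(d), (scale ** 2) * sigma_hat)
        return mu_hat + noise

    # Example: estimate the mean of a correlated Gaussian with no a priori
    # bound on its covariance matrix.
    rng = np.random.default_rng(0)
    cov = np.array([[4.0, 1.0], [1.0, 0.5]])
    data = rng.multivariate_normal([1.0, -2.0], cov, size=5000)
    print(rescaled_gaussian_mean(data, eps=1.0, delta=1e-6, rng=rng))

Shaping the noise by the empirical covariance is what allows the error to be measured in the Mahalanobis norm ‖·‖_Σ without an a priori bound on Σ, which is the point of the second estimator described in the abstract.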