Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Robust and differentially private mean estimation
Authors: Xiyang Liu, Weihao Kong, Sham Kakade, Sewoong Oh
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments support our theoretical claims. The left figure with (α, ε, δ, n) = (0.05, 20, 0.01, 10^6) is in the large α regime, where the DP Mean error is dominated by α√d and the PRIME error by α√log(1/α). Hence, the PRIME error is constant whereas the DP Mean error increases with the dimension d. |
| Researcher Affiliation | Academia | Xiyang Liu, Weihao Kong, Sham Kakade, Sewoong Oh Paul G. Allen School of Computer Science and Engineering, University of Washington EMAIL |
| Pseudocode | Yes | We introduce PRIME (PRIvate and robust Mean Estimation) in Section 2.3, with details in Algorithm 9 in Appendix E.1, to achieve computational efficiency. We present here the interactive version from the perspective of an analyst accessing the dataset via DP queries (qrange, qsize, qmean, qnorm, and qPCA), because this version makes clear the inner operations of each private mechanism, hence making (i) the sensitivity analysis transparent, (ii) checking the correctness of privacy guarantees easy, and (iii) tracking the privacy accountant simple. In practice, one should implement the centralized version (Algorithm 7 in Appendix D), which is significantly more efficient. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | All experiments are performed on synthetic data. We choose µ = 0 and σ = 1. The samples are drawn from N(0, Id) unless otherwise specified. The paper does not provide a link or specific details for accessing this generated data. |
| Dataset Splits | No | The paper describes numerical experiments on synthetic data but does not specify training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | The left figure with (α, ε, δ, n) = (0.05, 20, 0.01, 10^6) is in the large α regime... The second figure with (α, ε, δ, n) = (0.001, 20, 0.01, 10^6) is in the small α regime... The right figure with (α, δ, d, n) = (0.1, 0.01, 10, 10^6)... Details of the experiments are in Appendix L. All experiments are performed on synthetic data. We choose µ = 0 and σ = 1. The samples are drawn from N(0, Id) unless otherwise specified. |
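Since the paper releases no code, the synthetic setup it describes (n i.i.d. samples from N(0, I_d) with µ = 0 and σ = 1) would have to be regenerated from scratch. A minimal sketch of that data-generation step and a non-private empirical-mean baseline is shown below; the function names are illustrative assumptions, not the paper's PRIME implementation.

```python
import numpy as np

def sample_gaussian(n, d, mu=0.0, sigma=1.0, seed=0):
    """Draw n i.i.d. samples from N(mu * 1, sigma^2 * I_d), as in the paper's setup."""
    rng = np.random.default_rng(seed)
    return mu + sigma * rng.standard_normal((n, d))

def empirical_mean(x):
    """Non-private, non-robust baseline: the coordinate-wise sample mean."""
    return x.mean(axis=0)

if __name__ == "__main__":
    n, d = 100_000, 10
    x = sample_gaussian(n, d)
    # With the true mean at 0, the l2 error of the sample mean scales as sqrt(d/n).
    err = np.linalg.norm(empirical_mean(x))
    print(f"l2 error of empirical mean: {err:.4f}")
```

Because no random seed is reported in the paper, exact numerical reproduction of its figures is not possible; only the scaling behavior (e.g., error growth with d) can be checked.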