Discrete Distribution Estimation under Local Privacy
Authors: Peter Kairouz, Keith Bonawitz, Daniel Ramage
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Large scale simulations show that the optimal decoding algorithm for both k-RR and RAPPOR depends on the shape of the true underlying distribution. For skewed distributions, the projected estimator (introduced here) offers the best utility across a wide variety of privacy levels and sample sizes (Section 4.4). |
| Researcher Affiliation | Collaboration | Peter Kairouz (KAIROUZ2@ILLINOIS.EDU), Keith Bonawitz (BONAWITZ@GOOGLE.COM), Daniel Ramage (DRAMAGE@GOOGLE.COM). Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043; University of Illinois, Urbana-Champaign, 1308 W Main St, Urbana, IL 61801. |
| Pseudocode | No | The paper references "Algorithm 1 of (Wang & Carreira-Perpiñán, 2013)" but does not contain structured pseudocode or algorithm blocks within its own text. |
| Open Source Code | No | The paper mentions that RAPPOR is an "open source Google technology" but does not state that the authors are releasing their own code for the methods described in this paper (k-RR, O-RR). |
| Open Datasets | No | The paper describes generating input data from various statistical distributions (e.g., "geometric distribution", "binomial distributions", "Zipf distribution", "multinomial distributions drawn from a symmetric Dirichlet distribution") for simulations, but does not refer to or provide access information for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. It focuses on simulating data from distributions. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | Free parameters are set via grid search over k ∈ [2, 4, 8, ..., 2048, 4096], c ∈ [1, 2, 4, ..., 512, 1024], h ∈ [1, 2, 4, 8, 16] for each ε. |
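
Since the paper contains no pseudocode of its own, the following is a minimal sketch of the routine it cites, Algorithm 1 of Wang & Carreira-Perpiñán (2013), which computes the Euclidean projection of a vector onto the probability simplex. It is included on the assumption that the projected estimator mentioned under Research Type uses this projection to turn a raw frequency estimate (which may contain negative entries) into a valid distribution. The function name `project_onto_simplex` and the example input are hypothetical and do not come from the paper.

```python
def project_onto_simplex(y):
    """Euclidean projection of y onto {x : x_i >= 0, sum(x) == 1}.

    Sketch of the sort-based algorithm cited as Algorithm 1 of
    Wang & Carreira-Perpinan (2013); the function name is illustrative.
    """
    if not y:
        raise ValueError("input vector must be non-empty")

    # Sort entries in decreasing order.
    u = sorted(y, reverse=True)

    # Find the largest j with u_j + (1 - sum_{i<=j} u_i) / j > 0 and
    # remember the corresponding shift. The condition always holds for
    # j = 1, so `shift` is always assigned at least once.
    cumulative = 0.0
    shift = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (1.0 - cumulative) / j
        if u_j + candidate > 0:
            shift = candidate

    # Shift every entry by the same amount and clip negatives to zero.
    return [max(v + shift, 0.0) for v in y]


# Hypothetical raw estimate with a negative entry (not data from the paper).
raw_estimate = [0.5, 0.7, -0.2]
projected = project_onto_simplex(raw_estimate)
print(projected)
```

For the hypothetical input above, the result is approximately `[0.4, 0.6, 0.0]`: the negative entry is clipped to zero and the remaining mass is shifted so the vector sums to one. A vector that already lies on the simplex is returned unchanged up to floating-point rounding.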