Discrete Distribution Estimation under Local Privacy

Authors: Peter Kairouz, Keith Bonawitz, Daniel Ramage

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Large scale simulations show that the optimal decoding algorithm for both k-RR and RAPPOR depends on the shape of the true underlying distribution. For skewed distributions, the projected estimator (introduced here) offers the best utility across a wide variety of privacy levels and sample sizes (Section 4.4).
Researcher Affiliation | Collaboration | Peter Kairouz (KAIROUZ2@ILLINOIS.EDU); Keith Bonawitz (BONAWITZ@GOOGLE.COM); Daniel Ramage (DRAMAGE@GOOGLE.COM). Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043; University of Illinois, Urbana-Champaign, 1308 W Main St, Urbana, IL 61801.
Pseudocode | No | The paper references "Algorithm 1 of (Wang & Carreira-Perpiñán, 2013)" but does not contain structured pseudocode or algorithm blocks within its own text.
Open Source Code | No | The paper mentions that RAPPOR is an "open source Google technology" but does not state that the authors are releasing their own code for the methods described in this paper (k-RR, O-RR).
Open Datasets | No | The paper describes generating input data from various statistical distributions (e.g., "geometric distribution", "binomial distributions", "Zipf distribution", "multinomial distributions drawn from a symmetric Dirichlet distribution") for simulations, but does not refer to or provide access information for a publicly available or open dataset.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. It focuses on simulating data from distributions.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment.
Experiment Setup | Yes | Free parameters are set via grid search over k ∈ [2, 4, 8, ..., 2048, 4096], c ∈ [1, 2, 4, ..., 512, 1024], h ∈ [1, 2, 4, 8, 16] for each ε.
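
Since the summary above notes that the paper ships neither pseudocode nor code, the following is a minimal sketch of what k-RR encoding, its empirical (inverse) decoder, and a projected decoder might look like. It assumes the standard k-ary randomized response mechanism (report the true symbol with probability e^ε / (e^ε + k - 1), otherwise a uniformly random other symbol) and uses the well-known sort-based Euclidean projection onto the probability simplex as a stand-in for the cited Algorithm 1. All function names, the truncated geometric input, and the demo parameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def krr_randomize(x, k, eps, rng):
    """k-ary randomized response on an integer array x with values in 0..k-1."""
    p_keep = np.exp(eps) / (np.exp(eps) + k - 1)
    keep = rng.random(x.shape) < p_keep
    # A nonzero offset modulo k picks one of the other k-1 symbols uniformly.
    offsets = rng.integers(1, k, size=x.shape)
    return np.where(keep, x, (x + offsets) % k)

def krr_empirical_estimate(y, k, eps):
    """Invert the k-RR channel; entries may be negative at small sample sizes."""
    m_hat = np.bincount(y, minlength=k) / y.size
    return ((np.exp(eps) + k - 1) * m_hat - 1) / (np.exp(eps) - 1)

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u)
    j = np.arange(1, v.size + 1)
    rho = np.nonzero(u + (1 - css) / j > 0)[0][-1]
    shift = (1 - css[rho]) / (rho + 1)
    return np.maximum(v + shift, 0)

# Hypothetical demo: a skewed (truncated geometric) input distribution.
rng = np.random.default_rng(0)
k, eps, n = 16, 1.0, 10_000
p_true = 0.8 ** np.arange(k)
p_true /= p_true.sum()
x = rng.choice(k, size=n, p=p_true)
y = krr_randomize(x, k, eps, rng)
p_emp = krr_empirical_estimate(y, k, eps)
p_proj = project_to_simplex(p_emp)
print("L1 error, empirical:", np.abs(p_emp - p_true).sum())
print("L1 error, projected:", np.abs(p_proj - p_true).sum())
```

The empirical estimate always sums to one but can leave the simplex through negative entries; the projection step maps it back to the nearest valid distribution in Euclidean distance. A grid search like the one in the Experiment Setup row could be layered on top by looping this sketch over the listed parameter values for each ε.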