Bounding User Contributions: A Bias-Variance Trade-off in Differential Privacy

Authors: Kareem Amin, Alex Kulesza, Andres Munoz Medina, Sergei Vassilvitskii

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Figure 1. Error when privately estimating the number of ratings in the MovieLens dataset using different truncation strategies, averaged over 1000 runs."
Researcher Affiliation | Industry | "Google Research, New York, NY, USA. Correspondence to: Kareem Amin <kamin@google.com>, Alex Kulesza <kulesza@google.com>, Andres Muñoz Medina <ammedina@google.com>, Sergei Vassilvitskii <sergeiv@google.com>."
Pseudocode | Yes | "Algorithm 1 Contribution-bounded ERM. Input: Dataset S = (z1, . . . , zn) drawn from D, contribution fraction τ0 > 0, privacy parameter ε > 0. Construct Sτ where τ = τ0·n. Run ε-differentially private ERM on Sτ to obtain hpriv. Return hpriv."
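Algorithm 1's first step, bounding each user's contribution to at most τ examples, can be sketched as below. This is an illustrative reading of the pseudocode only: the function name `bound_contributions`, the random-subsampling rule, and the `(user, example)` pair representation are assumptions, and the subsequent ε-differentially private ERM step is left out.

```python
import random
from collections import defaultdict

def bound_contributions(samples, tau, seed=0):
    """Keep at most tau examples per user (contribution bounding).

    samples: list of (user_id, example) pairs.
    tau: per-user contribution cap (a positive integer).
    Users over the cap have tau of their examples sampled at random;
    users at or under the cap are kept in full.
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, example in samples:
        by_user[user].append(example)
    bounded = []
    for user, examples in by_user.items():
        if len(examples) > tau:
            examples = rng.sample(examples, tau)
        bounded.extend((user, ex) for ex in examples)
    return bounded
```

The bounded dataset Sτ returned here would then be handed to any off-the-shelf ε-differentially private ERM routine to obtain hpriv.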
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | "Example. The MovieLens 20M dataset consists of 20 million movie ratings from 138 thousand users." (https://grouplens.org/datasets/movielens/)
Dataset Splits | No | The paper refers to empirical risk minimization and general learning algorithms, but does not provide specific details on training, validation, or test dataset splits used for its own experiments, such as percentages, sample counts, or splitting methodology.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory configurations used for running experiments.
Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their versions) that would allow for replication of the experimental setup.
Experiment Setup | Yes | "Figure 1. Error when privately estimating the number of ratings in the MovieLens dataset using different truncation strategies, averaged over 1000 runs. We compare three methods of contribution bounding, followed by the Laplace mechanism: (1) truncation of {xi} to the (1 − 1/(εn))-quantile, as suggested by our analysis, (2) the commonly used truncation at the median, and (3) truncation at the 95% quantile."
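The truncate-then-Laplace estimator compared in Figure 1 can be sketched as follows. Only the recipe (clip per-user counts at an empirical quantile, then add Laplace noise scaled to the clipping threshold) comes from the quoted setup; the function name `truncated_private_sum` and the NumPy-based implementation are assumptions.

```python
import numpy as np

def truncated_private_sum(x, quantile, epsilon, rng):
    """Estimate sum(x) under epsilon-DP by truncating at an empirical quantile.

    x: array of per-user counts (e.g. number of ratings per user).
    quantile: truncation point in [0, 1]; the three strategies in Figure 1
        correspond to 1 - 1/(epsilon * len(x)), 0.5, and 0.95.
    epsilon: privacy parameter.
    """
    tau = np.quantile(x, quantile)
    clipped = np.minimum(x, tau)
    # After clipping, one user can change the sum by at most tau,
    # so Laplace noise with scale tau / epsilon suffices.
    noise = rng.laplace(scale=tau / epsilon)
    return clipped.sum() + noise
```

A lower quantile shrinks the noise (variance) but clips more mass from heavy users (bias), which is the trade-off the figure illustrates across the three strategies.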