Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Differentially Private Anonymized Histograms
Authors: Ananda Theertha Suresh
NeurIPS 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Motivated by these applications, we propose the first differentially private mechanism to release anonymized histograms that achieves near-optimal privacy utility trade-off both in terms of number of items and the privacy parameter. Further, if the underlying histogram is given in a compact format, the proposed algorithm runs in time sub-linear in the number of items. For anonymized histograms generated from unknown discrete distributions, we show that the released histogram can be directly used for estimating symmetric properties of the underlying distribution. The paper presents theoretical guarantees and algorithms (PRIVHIST) but does not include empirical studies with actual data, performance metrics, or comparisons on datasets. |
| Researcher Affiliation | Industry | Ananda Theertha Suresh Google Research, New York EMAIL |
| Pseudocode | Yes | Algorithm PRIVHIST Input: anonymized histogram h in terms of prevalences i.e., {(r, φ_r) : φ_r > 0}, privacy cost ε. Parameters: ε_1 = ε_2 = ε_3 = ε/3. Output: DP anonymized histogram H and N (an estimate of n). ... Algorithm PRIVHIST-LOWPRIVACY ... Algorithm PRIVHIST-HIGHPRIVACY |
| Open Source Code | No | The paper does not provide any explicit statement about making the source code available, nor does it include links to a code repository. |
| Open Datasets | No | The paper is theoretical and focuses on algorithm design and theoretical properties. It does not conduct empirical experiments with datasets; therefore, it does not mention public or open datasets for training. |
| Dataset Splits | No | The paper is theoretical and does not describe any empirical experiments or dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any empirical experiments that would require specific hardware. Therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and focuses on algorithm design and theoretical properties. It does not describe any empirical experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any empirical experiments. Therefore, no experimental setup details like hyperparameters or training configurations are provided. |
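
The Pseudocode row describes the PRIVHIST input as a list of prevalences {(r, φ_r)}, where φ_r is the number of distinct items that appear exactly r times. The following is a minimal sketch of how that input format could be built from raw data. It is not the PRIVHIST algorithm and adds no privacy noise; the function names `anonymized_histogram` and `prevalences` are hypothetical and do not come from the paper.

```python
from collections import Counter


def anonymized_histogram(items):
    """Return the multiset of item counts with labels dropped, largest first."""
    return sorted(Counter(items).values(), reverse=True)


def prevalences(items):
    """Map each count r to phi_r, the number of distinct items seen exactly r times.

    Only counts with phi_r > 0 appear, matching the compact input format
    quoted in the table. Illustration only: no differential privacy here.
    """
    counts = Counter(items).values()
    return dict(sorted(Counter(counts).items()))


sample = ["a", "b", "a", "c", "b", "a", "d"]
print(anonymized_histogram(sample))  # [3, 2, 1, 1]
print(prevalences(sample))           # {1: 2, 2: 1, 3: 1}
```

The prevalence form is compact when many items share the same count, which is the setting in which the quoted abstract claims sub-linear running time.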