Differentially Private Testing of Identity and Closeness of Discrete Distributions

Authors: Jayadev Acharya, Ziteng Sun, Huanyu Zhang

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We derive upper and lower bounds on the sample complexity of both problems under (ε, δ)-differential privacy. We provide sample-optimal algorithms for the identity testing problem in all parameter ranges, and the first results for closeness testing. Our closeness testing bounds are optimal in the sparse regime, where the number of samples is at most k. Our upper bounds are obtained by privatizing non-private estimators for these problems; the non-private estimators are chosen to have small sensitivity. We propose a general framework for establishing lower bounds on the sample complexity of statistical tasks under differential privacy: we show a bound on differentially private algorithms in terms of a coupling between the two hypothesis classes we aim to test. By carefully constructing priors over the hypothesis classes and using Le Cam's two-point theorem, we provide a general mechanism for proving lower bounds. We believe the framework can be used to obtain strong lower bounds for other statistical tasks under privacy. (Illustrative sketches of the privatization recipe and the coupling argument appear after this table.)
Researcher Affiliation | Academia | Jayadev Acharya, Cornell University, acharya@cornell.edu; Ziteng Sun, Cornell University, zs335@cornell.edu; Huanyu Zhang, Cornell University, hz388@cornell.edu
Pseudocode | Yes | Algorithm 1 (uniformity testing) and Algorithm 2. (A toy sketch of the privatize-and-threshold recipe appears after this table.)
Open Source Code | No | The paper is theoretical and focuses on algorithm design, analysis, and proving bounds. It does not mention providing open-source code for its described methodologies.
Open Datasets | No | The paper is theoretical and does not conduct experiments involving datasets for training. It discusses sampling from distributions conceptually but does not refer to specific publicly available datasets.
Dataset Splits | No | The paper is theoretical and does not conduct empirical experiments that would involve training, validation, or test dataset splits.
Hardware Specification | No | The paper is theoretical and does not report on running experiments that would require specific hardware. No hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and focuses on mathematical proofs and algorithm design. It does not mention any specific software dependencies with version numbers for implementation or analysis.
Experiment Setup | No | The paper is theoretical and does not describe empirical experimental setups, including hyperparameters, training configurations, or system-level settings.
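The upper-bound recipe quoted in the Research Type row (privatize a non-private statistic chosen to have small sensitivity) can be illustrated with a minimal sketch. This is not the paper's Algorithm 1: the chi-square-style statistic, the use of the Laplace mechanism, and the `threshold` and `sensitivity` values are assumptions made only to show the generic privatize-and-threshold pattern.

```python
import numpy as np


def dp_uniformity_test(samples, k, epsilon, threshold, sensitivity):
    """Illustrative (epsilon, 0)-DP test via "privatize a low-sensitivity statistic".

    NOT the paper's Algorithm 1: the statistic, the threshold, and the
    sensitivity bound are placeholders that a real analysis would supply.
    """
    n = len(samples)
    counts = np.bincount(samples, minlength=k)
    expected = n / k
    # Chi-square-style distance of the empirical counts from the uniform distribution.
    stat = np.sum((counts - expected) ** 2 - counts) / expected
    # Laplace mechanism: noise scale = (assumed) sensitivity / epsilon.
    noisy_stat = stat + np.random.laplace(scale=sensitivity / epsilon)
    return "uniform" if noisy_stat <= threshold else "not uniform"


# Hypothetical usage with made-up parameter values.
samples = np.random.randint(0, 100, size=5000)  # i.i.d. draws from uniform over k = 100
print(dp_uniformity_test(samples, k=100, epsilon=1.0, threshold=50.0, sensitivity=10.0))
```

In the paper, the non-private statistic is specifically designed so that its sensitivity to changing one sample is provably small; here that bound is simply passed in as an assumed parameter.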
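The coupling-based lower-bound framework mentioned in the same row can be paraphrased, very informally and for the pure ε-DP case only, as follows; the constants are illustrative and this is not the paper's exact statement.

```latex
% Informal sketch (pure \varepsilon-DP; constants illustrative).
% Let (X, Y) be a coupling of the n-sample distributions induced by priors on the
% two hypothesis classes, with expected Hamming distance E[d_H(X, Y)] <= D.
\[
  \Pr[A(x)\ \text{accepts}] \;\le\; e^{\varepsilon\, d_H(x,y)}\,\Pr[A(y)\ \text{accepts}]
  \qquad \text{(group privacy, for any fixed datasets } x, y\text{)},
\]
\[
  \Pr\bigl[d_H(X, Y) > 10D\bigr] \;\le\; \tfrac{1}{10}
  \qquad \text{(Markov's inequality)},
\]
\[
  \Pr[A(X)\ \text{accepts}] \;\le\; \tfrac{1}{10} + e^{10\varepsilon D}\,\Pr[A(Y)\ \text{accepts}].
\]
% If A succeeds with probability at least 2/3 under both hypotheses, the left side
% is at least 2/3 while Pr[A(Y) accepts] is at most 1/3, forcing e^{10 \varepsilon D}
% to exceed a constant and hence D = \Omega(1/\varepsilon).
```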