Differentially Private Identity and Equivalence Testing of Discrete Distributions
Authors: Maryam Aliakbarpour, Ilias Diakonikolas, Ronitt Rubinfeld
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform an experimental evaluation of our algorithms on synthetic data. Our experiments illustrate that our private testers achieve small type I and type II errors with sample size sublinear in the domain size of the underlying distributions. |
| Researcher Affiliation | Academia | ¹ CSAIL, MIT, Cambridge, MA 02139, USA; ² Department of Computer Science, USC, Los Angeles, CA 90089, USA; ³ TAU, Tel Aviv-Yafo, Israel. |
| Pseudocode | Yes | Algorithm 1 (Private Uniformity Testing via Unique Elements): Private-Unique-Elements-Uniformity; Algorithm 2 (Private uniformity tester based on the number of collisions): Private-Collisions-Uniformity; Algorithm 3 (Private Equivalence Tester): Private Equivalence-Test. An illustrative sketch of a collision-based private tester appears after the table. |
| Open Source Code | No | The paper does not provide any explicit statements or links regarding the availability of its source code. |
| Open Datasets | No | The paper evaluates on synthetic data that is mathematically defined within the paper; it does not use a publicly available dataset with a specific link, DOI, repository, or formal citation. |
| Dataset Splits | No | The paper does not specify training, validation, or test dataset splits. It describes running algorithms multiple times to estimate error probabilities on generated samples. |
| Hardware Specification | Yes | All experiments were performed on a computer with a 1.6 GHz Intel(R) Core(TM) i5-4200U CPU and 3 GB of RAM. |
| Software Dependencies | No | The paper mentions implementing algorithms but does not provide specific version numbers for any software dependencies (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | We run our two algorithms using samples from q+ and q- with the following parameters: n = 800,000, ε = 0.3, r = 300, and ξ = 0.2. We vary the sample size starting from 50,000 up to 3 × 10^6, increasing it by 50,000 at each step, and repeat the algorithm r = 200 times to estimate the maximum of the type I and type II errors. A hedged sketch of such an evaluation harness follows the table. |
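
To ground the pseudocode row, the following is a minimal Python sketch of a collision-based private uniformity tester in the spirit of Algorithm 2 (Private-Collisions-Uniformity). It is not the paper's algorithm: the Laplace output perturbation, the sensitivity bound, and the midpoint threshold are simplifying assumptions, the function name `private_collision_uniformity_test` is ours, and this naive privatization needs far more samples than the paper's sample-optimal tester.

```python
import numpy as np

def private_collision_uniformity_test(samples, n, eps, xi, rng=None):
    """Illustrative sketch only (not the paper's Algorithm 2).

    samples : 1-D integer array of draws from a distribution over {0, ..., n-1}
    n       : domain size
    eps     : distance parameter (reject if the distribution is eps-far from uniform)
    xi      : differential-privacy parameter
    """
    rng = np.random.default_rng() if rng is None else rng
    m = len(samples)

    # Collision statistic: fraction of unordered sample pairs with equal values;
    # this is an unbiased estimate of ||p||_2^2.
    counts = np.bincount(samples, minlength=n)
    collisions = np.sum(counts * (counts - 1)) / 2
    pairs = m * (m - 1) / 2
    stat = collisions / pairs

    # Changing a single sample moves it between two buckets and changes the
    # collision count by at most m - 1, so the normalized statistic has global
    # sensitivity at most (m - 1) / pairs = 2 / m.  Adding Laplace noise of
    # scale sensitivity / xi makes the output xi-differentially private.
    noisy_stat = stat + rng.laplace(scale=(2.0 / m) / xi)

    # The uniform distribution has ||p||_2^2 = 1/n, while any distribution
    # eps-far from uniform in total variation has ||p||_2^2 >= (1 + 4*eps**2)/n.
    # Splitting the gap with a midpoint threshold is an assumption here.
    threshold = (1.0 + 2.0 * eps**2) / n
    return "uniform" if noisy_stat <= threshold else "not uniform"
```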
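
The experiment-setup row reports estimating the maximum of the type I and type II errors from repeated runs on synthetic data. The harness below is a hedged sketch of that kind of evaluation, assuming the tester sketched above and a Paninski-style ±2ε/n perturbation of the uniform distribution for the far instance q-; the function name `empirical_error` and the construction of q- are our assumptions, not details quoted from the paper.

```python
import numpy as np

def empirical_error(tester, n, eps, xi, sample_size, repetitions=200, seed=0):
    """Estimate max(type I, type II) error of a private uniformity tester
    by repeated runs on synthetic data (illustrative harness; see caveats above)."""
    rng = np.random.default_rng(seed)

    # q+ is exactly uniform; q- is eps-far from uniform in total variation
    # (Paninski-style perturbation: +2*eps/n on the first half of the domain,
    # -2*eps/n on the second half; requires n even and eps <= 1/2).
    q_plus = np.full(n, 1.0 / n)
    q_minus = np.full(n, 1.0 / n)
    q_minus[: n // 2] += 2.0 * eps / n
    q_minus[n // 2:] -= 2.0 * eps / n
    # Renormalize to guard against floating-point drift before sampling.
    q_plus /= q_plus.sum()
    q_minus /= q_minus.sum()

    type1 = type2 = 0
    for _ in range(repetitions):
        s_plus = rng.choice(n, size=sample_size, p=q_plus)
        s_minus = rng.choice(n, size=sample_size, p=q_minus)
        if tester(s_plus, n, eps, xi, rng) != "uniform":
            type1 += 1   # rejected the truly uniform q+
        if tester(s_minus, n, eps, xi, rng) == "uniform":
            type2 += 1   # accepted the eps-far q-
    return max(type1, type2) / repetitions
```

With the parameters quoted in the row, one point on the error curve would be computed as, e.g., `empirical_error(private_collision_uniformity_test, n=800_000, eps=0.3, xi=0.2, sample_size=50_000)`, sweeping `sample_size` up to 3 × 10^6.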