Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Private Identity Testing for High-Dimensional Distributions
Authors: Clément L. Canonne, Gautam Kamath, Audra McMillan, Jonathan Ullman, Lydia Zakynthinou
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our main contribution is to give novel algorithms for hypothesis testing of high-dimensional distributions with improved sample complexity. In particular, we give differentially private algorithms for the following fundamental problems: |
| Researcher Affiliation | Collaboration | Clément Canonne IBM Research, Almaden EMAIL; Gautam Kamath Cheriton School of Computer Science University of Waterloo EMAIL; Audra McMillan Khoury College of Computer Sciences, Northeastern University Department of Computer Science, Boston University EMAIL; Jonathan Ullman Khoury College of Computer Sciences Northeastern University EMAIL; Lydia Zakynthinou Khoury College of Computer Sciences Northeastern University EMAIL |
| Pseudocode | Yes | Algorithm 1 LIPEXTTEST; Algorithm 2 Private Uniformity Testing via Lipschitz Extension |
| Open Source Code | No | The paper does not provide any links to open-source code or state that code is made available. |
| Open Datasets | No | The paper discusses theoretical sample complexity for distributions (e.g., "product distribution P over {±1}^d", "multivariate Gaussian P in R^d") rather than empirical evaluation on specific named datasets. Therefore, no access information for a public dataset is provided. |
| Dataset Splits | No | The paper does not mention any training, validation, or test dataset splits, as it focuses on theoretical analysis rather than empirical experimentation. |
| Hardware Specification | No | The paper focuses on theoretical algorithms and their sample complexity. It does not mention any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and describes algorithms and proofs. It does not list any software dependencies or version numbers. |
| Experiment Setup | No | The paper presents theoretical algorithms and their analysis (e.g., sample complexity). It does not describe any experimental setup details such as hyperparameters, training configurations, or system-level settings. |