Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Query-Efficient Locally Private Hypothesis Selection via the Scheffe Graph
Authors: Gautam Kamath, Alireza F. Pour, Matthew Regehr, David P. Woodruff
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose an algorithm with improved query-complexity for the problem of hypothesis selection under local differential privacy constraints. Given a set of k probability distributions Q, we describe an algorithm that satisfies local differential privacy, performs O(k3/2) non-adaptive queries to individuals who each have samples from a probability distribution p, and outputs a probability distribution from the set Q which is nearly the closest to p. Previous algorithms required either ̣(k2) queries or many rounds of interactive queries. Technically, we introduce a new object we dub the Scheffé graph, which captures structure of the differences between distributions in Q, and may be of more broad interest for hypothesis selection tasks. |
| Researcher Affiliation | Academia | Gautam Kamath University of Waterloo and Vector Institute EMAIL Alireza F. Pour University of Waterloo EMAIL Matthew Regehr University of Waterloo EMAIL David P. Woodruff Carnegie Mellon University EMAIL |
| Pseudocode | No | The paper describes algorithmic steps in prose, for example, under 'Proof of Theorem 14': 'We proceed by dominating vertices with small in-degree separately from vertices with large in-degree. To that end, set r := k log k and let B be those vertices with in-degree less than r. By the previous proposition, |B| 3k3/2 log k. Now, draw uniformly at random a subset R of size ℓ:= k3/2 log k from the whole vertex set V . For a fixed v V \ B, P(v / Nout(R)) = ...' However, it does not contain any clearly labeled algorithm block or pseudocode section. |
| Open Source Code | No | The paper does not provide any specific links to source code repositories, nor does it contain any explicit statements about releasing code in supplementary materials or elsewhere. The NeurIPS checklist indicates 'NA' for questions related to code and data reproducibility, reinforcing that no code is provided. |
| Open Datasets | No | The paper is theoretical and focuses on mathematical proofs and algorithm design for hypothesis selection. It does not perform experiments on specific datasets. The NeurIPS checklist indicates 'NA' for questions related to code and data reproducibility, reinforcing that no datasets are used for empirical evaluation. |
| Dataset Splits | No | The paper is a theoretical work and does not involve experiments with empirical datasets. Therefore, there is no mention of dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is a theoretical work focusing on algorithm design and mathematical proofs. It does not present experimental results that would require specific hardware. The NeurIPS checklist indicates 'NA' for questions related to experiments and compute resources. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental implementations or specific software environments with version numbers. The NeurIPS checklist indicates 'NA' for questions related to experiments and code reproducibility. |
| Experiment Setup | No | The paper is theoretical and does not present any experimental results. Therefore, there are no details provided regarding experimental setup, hyperparameters, training configurations, or system-level settings. |