Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Differentially Private Quantiles with Smaller Error
Authors: Jacob Imola, Fabrizio Boninsegna, Hannah Keller, Anders Aamand, Amrita Roy Chowdhury, Rasmus Pagh
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide experimental evaluation which confirms that our mechanism performs favorably compared to prior work in practice, in particular when the number of quantiles m is large. |
| Researcher Affiliation | Academia | Jacob Imola (BARC, University of Copenhagen, Denmark); Fabrizio Boninsegna (University of Padova, Italy); Hannah Keller (Aarhus University, Denmark); Anders Aamand (University of Copenhagen, Denmark); Amrita Roy Chowdhury (University of Michigan, Ann Arbor, United States of America); Rasmus Pagh (BARC, University of Copenhagen, Denmark) |
| Pseudocode | Yes | Algorithm 1 Slice Quantiles. 1: Input: X, r_1, ..., r_m, (w, ε_1), (ℓ, ε_2), γ, [0, b] |
| Open Source Code | Yes | For our experiments (code is open-sourced⁴), we use a variant of the k-ary tree CC mechanism introduced in [4] with two-sided geometric noise, and Single Quantile implemented as the exponential mechanism [18, 20]. ⁴ https://github.com/NynsenFaber/DP_CC_quantiles |
| Open Datasets | Yes | we use a variant of the k-ary tree CC mechanism introduced in [4] with two-sided geometric noise, and Single Quantile implemented as the exponential mechanism [18, 20]. Similarly to Kaplan et al. [18], we construct two real-valued datasets by adding small Gaussian noise to the Adult Age and Adult Hours datasets [5]; both datasets, corresponding to ages and hours worked per week, exist on the interval [0, 100]; we use this as our data domain. [5] Barry Becker and Ronny Kohavi. Adult. UCI Machine Learning Repository, 1996. DOI: https://doi.org/10.24432/C5XW20. |
| Dataset Splits | No | For each m ∈ {10, 20, ..., 200}, we randomly sample m quantiles from the set of 250 uniformly spaced quantiles {i/251 : i = 1, ..., 250}, and run experiments on both datasets. This sampling procedure is performed independently for each experiment, ensuring that the reported results represent an average over both good and bad instantiations of the problem. |
| Hardware Specification | Yes | The experiments are not compute-intensive and were run locally on a laptop. |
| Software Dependencies | No | For our experiments (code is open-sourced⁴), we use a variant of the k-ary tree CC mechanism introduced in [4] with two-sided geometric noise, and Single Quantile implemented as the exponential mechanism [18, 20]. |
| Experiment Setup | Yes | Implementation details of Slice Quantiles. Our empirical results indicate that the most effective strategy for allocating the privacy budget between CC and Single Quantile is to divide it equally, assigning half to each mechanism. To compute the size of the slice, we use h = ⌈(2/ε) log(2mψ)⌉ according to Theorem C.1, with β = 0.05 and ψ = (b − a)/g = 100n, as [a, b] = [0, 100] and g = 1/n. We used numerical optimization of the Chernoff bound to compute the smallest possible value of the parameter w bounding the CC mechanism with failure δ, beyond the asymptotic expression given in Lemma 3.1. The privacy settings are: (1, 10⁻¹⁶)-DP for Slice Quantiles and AQ, and (1, 0)-DP for AQ with pure DP guarantee [18]. |
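
The setup quoted in the table can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the authors' repository: every function name is hypothetical, the exponential mechanism is the textbook interval-based version for a single quantile (rank utility with sensitivity 1), and the two-sided geometric sampler assumes a count with sensitivity 1. It shows how the target quantiles are drawn from {i/251 : i = 1, ..., 250}, how the budget is split equally, and how ψ = (b − a)/g reduces to 100n on [0, 100] with g = 1/n.

```python
import math
import random


def sample_target_quantiles(m, rng):
    """Draw m distinct quantiles from {i/251 : i = 1, ..., 250}, sorted."""
    grid = [i / 251 for i in range(1, 251)]
    return sorted(rng.sample(grid, m))


def split_budget(eps):
    """Equal split of the privacy budget between the two sub-mechanisms."""
    return eps / 2, eps / 2


def domain_ratio(a, b, n):
    """psi = (b - a) / g with g = 1/n, i.e. (b - a) * n."""
    return (b - a) * n


def two_sided_geometric(eps, rng):
    """Integer noise with P(Z = z) proportional to exp(-eps * |z|).

    Assumption: the noised count has sensitivity 1. The difference of two
    i.i.d. geometric variables has exactly this distribution.
    """
    def geometric():
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        return math.floor(-math.log(u) / eps)
    return geometric() - geometric()


def exp_mech_quantile(data, q, eps, lo, hi, rng):
    """Textbook exponential mechanism for one quantile on [lo, hi], lo < hi.

    Interval i lies between the i-th and (i+1)-th sorted points and gets
    utility -|i - q * n|; it is chosen with probability proportional to
    length * exp(eps * utility / 2), then a point is drawn uniformly in it.
    """
    pts = [lo] + sorted(min(max(x, lo), hi) for x in data) + [hi]
    n = len(data)
    log_w = []
    for i in range(n + 1):
        length = pts[i + 1] - pts[i]
        utility = -abs(i - q * n)
        # Zero-length intervals get weight 0 (log-weight -inf).
        log_w.append(math.log(length) + eps * utility / 2 if length > 0 else -math.inf)
    top = max(log_w)  # subtract the max for numerical stability
    weights = [math.exp(lw - top) for lw in log_w]
    i = rng.choices(range(n + 1), weights=weights)[0]
    return rng.uniform(pts[i], pts[i + 1])
```

For example, with `rng = random.Random(0)`, calling `sample_target_quantiles(20, rng)` yields one random instance of the m = 20 problem, and `split_budget(1.0)` gives the per-mechanism budgets used with the ε = 1 settings listed above. Small Gaussian noise added to the data, as described in the Open Datasets row, makes ties between points unlikely.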