reproducibilityindex.ai

Differentially Private Quantiles

Authors: Jennifer Gillenwater, Matthew Joseph, Alex Kulesza

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We now empirically evaluate JointExp against three alternatives: AppIndExp, CSmooth, and AggTree. We evaluate our four algorithms on four datasets: synthetic Gaussian data from N(0, 5), synthetic uniform data from U(−5, 5), and real collections of book ratings and page counts from Goodreads (Soumik, 2019) (Figure 2).
Researcher Affiliation	Industry	Equal contributions, all authors at Google Research New York. Correspondence to: Jennifer Gillenwater <jengi@google.com>, Matthew Joseph <mtjoseph@google.com>, Alex Kulesza <kulesza@google.com>.
Pseudocode	Yes	Algorithm 1: Pseudocode for JointExp
Open Source Code	Yes	All experiment code is publicly available (Google, 2021). Google. dp_multiq. https://github.com/google-research/google-research/tree/master/dp_multiq, 2021.
Open Datasets	Yes	We evaluate our four algorithms on four datasets: synthetic Gaussian data from N(0, 5), synthetic uniform data from U(−5, 5), and real collections of book ratings and page counts from Goodreads (Soumik, 2019) (Figure 2).
Dataset Splits	No	No specific dataset split information (percentages, counts, or predefined splits) for training, validation, or testing was found. The paper mentions '20 trials of 1000 random samples'.
Hardware Specification	Yes	All experiments were run on a machine with two CPU cores and 100GB RAM.
Software Dependencies	No	The paper mentions 'scipy.special.logsumexp' and refers to a 'racing sampling method' and numerical improvements, but does not provide specific version numbers for software dependencies like Python or SciPy itself.
Experiment Setup	Yes	In each case, the requested quantiles are evenly spaced. m = 1 is median estimation, m = 2 requires estimating the 33rd and 67th percentiles, and so on. We average scores across 20 trials of 1000 random samples. For every experiment, we take [−100, 100] as the (loose) user-provided data range. For the Goodreads page numbers dataset, we also divide each value by 100 to scale the values to [−100, 100]. Experiments for ε = 1 appear in Figure 3.
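
The experiment setup quoted above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code from the dp_multiq repository: the function names `evenly_spaced_quantiles` and `prepare_sample` are hypothetical, and sampling without replacement and clipping to the data range are assumptions that the quoted text does not confirm. Evenly spaced quantiles are taken to mean j / (m + 1) for j = 1, ..., m, which matches the quoted examples (m = 1 gives the median; m = 2 gives roughly the 33rd and 67th percentiles).

```python
import numpy as np


def evenly_spaced_quantiles(m):
    """Return m evenly spaced target quantiles j / (m + 1), j = 1..m."""
    return np.arange(1, m + 1) / (m + 1)


def prepare_sample(values, rng, n=1000, lower=-100.0, upper=100.0, scale=1.0):
    """Draw n random points, rescale them, and clip to [lower, upper].

    Assumptions (not stated in the source): sampling is without
    replacement, and out-of-range values are clipped rather than dropped.
    Use scale=100.0 to mimic the division applied to the page counts.
    """
    values = np.asarray(values, dtype=float)
    sample = rng.choice(values, size=n, replace=False)
    return np.clip(sample / scale, lower, upper)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(evenly_spaced_quantiles(2))  # the two targets for m = 2
    # Hypothetical stand-in for a page-count column.
    fake_page_counts = rng.integers(1, 5000, size=10_000)
    sample = prepare_sample(fake_page_counts, rng, scale=100.0)
    print(sample.min(), sample.max())
```

A full trial would pass the prepared sample and the target quantiles to each algorithm and average the resulting scores over 20 such draws, as the setup describes.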