Differentially Private Submodular Maximization: Data Summarization in Disguise
Authors: Marko Mitrovic, Mark Bun, Andreas Krause, Amin Karbasi
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We additionally provide two concrete experiments to validate the efficacy of these algorithms. |
| Researcher Affiliation | Academia | 1Yale University, New Haven, CT, USA 2Princeton University, Princeton, NJ, USA 3ETH Zurich, Zurich, Switzerland. |
| Pseudocode | Yes | Algorithm 1 Diff. Private Greedy (Cardinality); Algorithm 2 Differentially Private Greedy (p-system); Algorithm 3 Diff. Private Subsample-Greedy |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a link or explicit statement of release) for the source code of the methodology described. |
| Open Datasets | Yes | We analyze a dataset of 10,000 Uber pickups in Manhattan in April 2014 (Uber Dataset). [...] Uber Dataset. Uber pickups in New York City. URL https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city. [...] We analyze a dataset created from a combination of National Health Examination Surveys ranging from 2007 to 2014 (NHANES Dataset). [...] NHANES Dataset. National Health and Nutrition Examination Survey (2007-2014). URL https://wwwn.cdc.gov/nchs/nhanes/default.aspx. |
| Dataset Splits | No | The paper mentions running simulations (e.g., "average utility across 100 simulations") but does not specify explicit train/validation/test splits, sample counts for each split, or cross-validation setup for the datasets used. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper describes the algorithms and experiments but does not list any specific software dependencies with version numbers. |
| Experiment Setup | Yes | We use ϵ = 0.1 and δ = 2^-20. For our settings of parameters, basic composition outperforms advanced composition, so the privacy budget of ϵ = 0.1 is split equally across the k iterations, meaning the mechanism at each iteration uses ϵ0 = ϵ/k. Our figures plot the average utility across 100 simulations. [...] We run 1,000 simulations with ϵ = 1.0 and δ = 2^-20. |
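The setup row above describes the key mechanics: a total privacy budget ϵ is split equally across the k greedy iterations (basic composition), so each iteration selects an element with budget ϵ0 = ϵ/k. The sketch below illustrates this style of differentially private greedy selection using the exponential mechanism; it is a minimal illustration under assumed details (a toy coverage function, unit sensitivity, and function names of our own choosing), not the paper's exact algorithm.

```python
import math
import random

def exponential_mechanism(candidates, scores, epsilon, sensitivity=1.0):
    """Sample one candidate with probability proportional to
    exp(epsilon * score / (2 * sensitivity))."""
    max_s = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(epsilon * (s - max_s) / (2.0 * sensitivity))
               for s in scores]
    r = random.uniform(0.0, sum(weights))
    acc = 0.0
    for c, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return c
    return candidates[-1]  # fall back on rounding edge cases

def private_greedy(ground_set, f, k, epsilon):
    """Greedy maximization under a cardinality constraint: each of the
    k iterations picks an element via the exponential mechanism with
    per-iteration budget eps0 = epsilon / k (basic composition)."""
    eps0 = epsilon / k
    S = []
    for _ in range(k):
        candidates = [e for e in ground_set if e not in S]
        gains = [f(S + [e]) - f(S) for e in candidates]  # marginal gains
        S.append(exponential_mechanism(candidates, gains, eps0))
    return S

# Toy monotone submodular objective: f(S) = number of distinct points covered.
points = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 5, 6}}
f = lambda S: len(set().union(*(points[e] for e in S)))
selected = private_greedy(list(points), f, k=2, epsilon=0.1)
print(selected)
```

Because selection is randomized, the output set varies between runs; the paper's figures correspondingly report utility averaged over 100 (or 1,000) simulations rather than a single run.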