reproducibilityindex.ai

Differentially Private Submodular Maximization: Data Summarization in Disguise

Authors: Marko Mitrovic, Mark Bun, Andreas Krause, Amin Karbasi

ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We additionally provide two concrete experiments to validate the efﬁcacy of these algorithms.
Researcher Affiliation	Academia	1Yale University, New Haven, CT, USA 2Princeton University, Princeton, NJ, USA 3ETH Zurich, Zurich, Switzerland.
Pseudocode	Yes	Algorithm 1 Diff. Private Greedy (Cardinality) GO; Algorithm 2 Differentially Private Greedy (p-system) GO; Algorithm 3 Diff. Private Subsample-Greedy SGO
Open Source Code	No	The paper does not provide any concrete access information (e.g., a link or explicit statement of release) for the source code of the methodology described.
Open Datasets	Yes	We analyze a dataset of 10,000 Uber pickups in Manhattan in April 2014 (Uber Dataset). [...] Uber Dataset. Uber pickups in new york city. URL https://www. kaggle.com/fivethirtyeight/ uber-pickups-in-new-york-city. [...] We analyze a dataset created from a combination of National Health Examination Surveys ranging from 2007 to 2014 (NHANESDataset). [...] NHANESDataset. National health and nutrition examination survey (2007 2014). URL https://wwwn. cdc.gov/nchs/nhanes/default.aspx.
Dataset Splits	No	The paper mentions running simulations (e.g., "average utility across 100 simulations") but does not specify explicit train/validation/test splits, sample counts for each split, or cross-validation setup for the datasets used.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies	No	The paper describes the algorithms and experiments but does not list any specific software dependencies with version numbers.
Experiment Setup	Yes	We use ϵ = 0.1 and δ = 2^-20. For our settings of parameters, basic composition outperforms advanced composition, so the privacy budget of ϵ = 0.1 is split equally across the k iterations, meaning the mechanism at each iteration uses ϵ0 = ϵ/k. Our ﬁgures plot the average utility across 100 simulations. [...] We run 1,000 simulations with ϵ = 1.0 and δ = 2^-20.