reproducibilityindex.ai

Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

Authors: Yu-Xiang Wang, Stephen Fienberg, Alex Smola

ICML 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To evaluate how our proposed methods work in practice, we selected two binary classification datasets: Abalone and Adult, from the first page of UCI Machine Learning Repository and performed privacy constrained logistic regression on them. Specifically, we compared two of our proposed methods, OPS mechanism and hybrid algorithm against the state-of-the-art empirical risk minimization algorithm OBJPERT (Chaudhuri et al., 2011; Kifer et al., 2012) under varying level of differential privacy protection. The results are shown in Figure 1.
Researcher Affiliation	Collaboration	Yu-Xiang Wang (YUXIANGW@CS.CMU.EDU), Stephen E. Fienberg (FIENBERG@STAT.CMU.EDU), Alexander J. Smola (ALEX@SMOLA.ORG). Affiliations: Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Marianas Labs Inc., Pittsburgh, PA 15213, USA
Pseudocode	Yes	Algorithm 1: One-Posterior Sample (OPS) estimator; Algorithm 2: Differentially Private Stochastic Gradient Langevin Dynamics (DP-SGLD); Algorithm 3: Hybrid Posterior Sampling Algorithm
Open Source Code	No	The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper.
Open Datasets	Yes	To evaluate how our proposed methods work in practice, we selected two binary classification datasets: Abalone and Adult, from the first page of UCI Machine Learning Repository
Dataset Splits	No	The paper describes using datasets but does not explicitly provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning.
Hardware Specification	No	The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment.
Experiment Setup	Yes	Minibatch size and number of data passes in the hybrid DP-SGNHT are chosen to be both . All optimization based methods are solved using BFGS algorithm to high numerical accuracy.
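
Since no source code is released, anyone attempting a reproduction has to start from the pseudocode. As a rough orientation, the sketch below shows the standard stochastic gradient Langevin dynamics (SGLD) update for Bayesian logistic regression, which is the kind of sampler that Algorithm 2 (DP-SGLD) is named after. It is not the paper's algorithm: the privacy calibration of step size and noise is omitted, and the function name `sgld_logistic`, the Gaussian prior, and all default values (`n_passes`, `batch_size`, `step`, `prior_var`) are assumptions made for illustration. In particular, the minibatch size and number of passes are placeholders, since the value reported in the Experiment Setup row is missing from this page.

```python
import numpy as np


def sgld_logistic(X, y, n_passes=5, batch_size=32, step=1e-3,
                  prior_var=10.0, seed=0):
    """Plain (non-private) SGLD for logistic regression, labels in {-1, +1}.

    Returns the final iterate, used here as one approximate posterior sample.
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_passes):
        order = rng.permutation(n)  # one pass = one shuffled sweep of the data
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            margins = yb * (Xb @ theta)
            # sigmoid(-margin), written with tanh for numerical stability
            weights = 0.5 * (1.0 - np.tanh(0.5 * margins))
            # Minibatch gradient of the log-likelihood, rescaled to the full data
            grad_lik = (n / len(idx)) * (Xb.T @ (weights * yb))
            # Gradient of the log of a zero-mean Gaussian prior (assumed prior)
            grad_prior = -theta / prior_var
            # Langevin step: half-step along the gradient plus N(0, step) noise
            noise = rng.normal(0.0, np.sqrt(step), size=d)
            theta = theta + 0.5 * step * (grad_prior + grad_lik) + noise
    return theta
```

Calling `sgld_logistic(X, y)` with a feature matrix `X` of shape `(n, d)` and labels `y` in {-1, +1} returns a single parameter vector. A faithful reproduction would replace the fixed `step` and the noise scale with the privacy-dependent choices given in the paper's Algorithm 2.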