Privacy for Free: Posterior Sampling and Stochastic Gradient Monte Carlo

Authors: Yu-Xiang Wang, Stephen Fienberg, Alex Smola

ICML 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate how our proposed methods work in practice, we selected two binary classification datasets: Abalone and Adult, from the first page of UCI Machine Learning Repository and performed privacy constrained logistic regression on them. Specifically, we compared two of our proposed methods, OPS mechanism and hybrid algorithm against the state-of-the-art empirical risk minimization algorithm OBJPERT (Chaudhuri et al., 2011; Kifer et al., 2012) under varying level of differential privacy protection. The results are shown in Figure 1.
Researcher Affiliation | Collaboration | Yu-Xiang Wang (YUXIANGW@CS.CMU.EDU), Stephen E. Fienberg (FIENBERG@STAT.CMU.EDU), Alexander J. Smola (ALEX@SMOLA.ORG). Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Marianas Labs Inc., Pittsburgh, PA 15213, USA
Pseudocode | Yes | Algorithm 1: One-Posterior Sample (OPS) estimator; Algorithm 2: Differentially Private Stochastic Gradient Langevin Dynamics (DP-SGLD); Algorithm 3: Hybrid Posterior Sampling Algorithm
Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper.
Open Datasets | Yes | To evaluate how our proposed methods work in practice, we selected two binary classification datasets: Abalone and Adult, from the first page of UCI Machine Learning Repository
Dataset Splits | No | The paper describes using datasets but does not explicitly provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment.
Experiment Setup | Yes | Minibatch size and number of data passes in the hybrid DP-SGNHT are chosen to be both . All optimization based methods are solved using BFGS algorithm to high numerical accuracy.
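
Since the paper releases no code, the following is a minimal sketch of how minibatch size and number of data passes enter a stochastic gradient Langevin sampler for logistic regression. It is plain, non-private SGLD and is not the paper's DP-SGLD or hybrid algorithm: the privacy-specific constraints on step size and noise are not reproduced here. The function name `sgld_logistic`, its parameters, and all default values are hypothetical choices for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sgld_logistic(X, y, step=1e-3, batch_size=8, passes=5, prior_var=1.0, seed=0):
    """Generic (non-private) SGLD for Bayesian logistic regression.

    X is a list of feature lists, y a list of labels in {-1, +1}.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    theta = [0.0] * d
    steps_per_pass = max(1, n // batch_size)
    for _ in range(passes * steps_per_pass):
        batch = rng.sample(range(n), batch_size)
        # Gradient of the log prior N(0, prior_var * I).
        grad = [-t / prior_var for t in theta]
        for i in batch:
            margin = y[i] * sum(t * x for t, x in zip(theta, X[i]))
            # Gradient of log sigmoid(margin), rescaled by n / batch_size
            # so the minibatch sum estimates the full-data gradient.
            w = (1.0 - sigmoid(margin)) * y[i] * (n / batch_size)
            for j in range(d):
                grad[j] += w * X[i][j]
        # Langevin update: half a gradient step plus N(0, step) noise.
        theta = [
            t + 0.5 * step * g + rng.gauss(0.0, math.sqrt(step))
            for t, g in zip(theta, grad)
        ]
    return theta
```

Calling `sgld_logistic(X, y, batch_size=m, passes=k)` performs roughly `k * n / m` updates and returns the final iterate, which serves as one approximate posterior sample. The optimization-based baselines in the table would instead be fit with a BFGS solver.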