Differentially Private Markov Chain Monte Carlo
Authors: Mikko Heikkilä, Joonas Jälkö, Onur Dikmen, Antti Honkela
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In order to demonstrate our proposed method in practice, we use a simple 2-dimensional Gaussian mixture model... We use b = 1000 for the minibatches, and adjust the temperature of the chain s.t. $N_0 = 100$ in (23)... As shown in Figure 2, the samples from the tempered chain with DP are nearly indistinguishable from the samples drawn from the non-private tempered chain. We also compared our method against the DP stochastic gradient Langevin dynamics (DP SGLD) method of Li et al. [2019]. Figure 3 illustrates how the accuracy is affected by privacy. |
| Researcher Affiliation | Academia | Mikko A. Heikkilä Helsinki Institute for Information Technology HIIT, Department of Mathematics and Statistics University of Helsinki, Helsinki, Finland mikko.a.heikkila@helsinki.fi; Joonas Jälkö Helsinki Institute for Information Technology HIIT, Department of Computer Science Aalto University, Espoo, Finland joonas.jalko@aalto.fi; Onur Dikmen Center for Applied Intelligent Systems Research (CAISR) Halmstad University, Halmstad, Sweden onur.dikmen@hh.se; Antti Honkela Helsinki Institute for Information Technology HIIT, Department of Computer Science University of Helsinki, Helsinki, Finland antti.honkela@helsinki.fi |
| Pseudocode | No | The paper describes algorithms in text and mathematical formulas but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for running all the experiments is available at https://github.com/DPBayes/DP-MCMC-NeurIPS2019. |
| Open Datasets | Yes | In order to demonstrate our proposed method in practice, we use a simple 2-dimensional Gaussian mixture model that has been used by Welling and Teh [2011] and Seita et al. [2017] in the non-private setting: $\theta_j \sim \mathcal{N}(0, \sigma_j^2),\; j = 1, 2$; $x_i \sim 0.5\,\mathcal{N}(\theta_1, \sigma_x^2) + 0.5\,\mathcal{N}(\theta_1 + \theta_2, \sigma_x^2)$ (25), where $\sigma_1^2 = 10$, $\sigma_2^2 = 1$, $\sigma_x^2 = 2$. For the observed data, we use fixed parameter values $\theta = (0, 1)$. Following Seita et al. [2017], we generate $10^6$ samples from the model to use as training data. *(A data-generation sketch follows the table.)* |
| Dataset Splits | No | The paper mentions 'training data' and burning-in iterations, but it does not specify explicit validation splits or cross-validation details. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory details) used for running experiments. |
| Software Dependencies | No | The paper mentions different methods and refers to a GitHub repository for code, but it does not specify any software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | We use b = 1000 for the minibatches, and adjust the temperature of the chain s.t. $N_0 = 100$ in (23). This corresponds to the temperature used by Seita et al. [2017] in their non-private test. To simulate this effect, we use the differentially private variational inference (DPVI) introduced by Jälkö et al. [2017] with a small privacy budget $(0.22, 10^{-6})$ to find a rough estimate for the initial location. The DP MCMC method was burned in for 1,000 iterations and DP SGLD for 100,000 iterations. *(A minibatch acceptance-test sketch follows the table.)* |
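The mixture model quoted in the Open Datasets row is fully specified, so the synthetic training data can be regenerated directly. The following is a minimal sketch, not the authors' code: the variable names and the RNG seed are ours, and only the data-generating distribution and the fixed $\theta = (0, 1)$ come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Variances from Eq. (25): sigma_1^2 = 10, sigma_2^2 = 1, sigma_x^2 = 2.
# Only sigma_x^2 is needed to generate the observed data.
sigma_x_sq = 2.0

# Fixed "true" parameter values used for the observed data
theta = np.array([0.0, 1.0])

# Draw 10^6 training points from the two-component mixture:
# x_i ~ 0.5 N(theta_1, sigma_x^2) + 0.5 N(theta_1 + theta_2, sigma_x^2)
n = 10**6
comp = rng.random(n) < 0.5                       # mixture component indicator
means = np.where(comp, theta[0], theta[0] + theta[1])
x = rng.normal(loc=means, scale=np.sqrt(sigma_x_sq), size=n)
```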
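The Experiment Setup row describes a tempered minibatch MCMC chain. Below is a heavily simplified sketch of one Barker-style accept/reject step of the kind the paper builds on (Seita et al. [2017]): the CLT-estimated subsampling noise in the log-likelihood-ratio statistic is topped up with Gaussian noise so the total roughly matches the variance of a standard logistic variable ($\pi^2/3$). The function names (`dp_barker_step`, `log_lik`, `log_prior`) and the temperature parameterization are our assumptions, and the sketch omits the paper's correction distribution and privacy accounting, so it is illustrative only, not the authors' method.

```python
import numpy as np

def dp_barker_step(theta, theta_prop, x, b, temperature, rng,
                   log_lik, log_prior):
    """One minibatch Barker-style accept/reject step (illustrative sketch).

    `log_lik(batch, theta)` must return per-example log-likelihoods.
    `temperature` scales the log-likelihood ratio (tempering).
    """
    n = len(x)
    batch = x[rng.choice(n, size=b, replace=False)]

    # Minibatch estimate of the tempered log-likelihood ratio
    llr = log_lik(batch, theta_prop) - log_lik(batch, theta)
    delta = temperature * (n / b) * llr.sum()
    delta += log_prior(theta_prop) - log_prior(theta)

    # CLT estimate of the subsampling noise variance of `delta`
    s2 = temperature**2 * (n**2 / b) * llr.var(ddof=1)

    # Top up with Gaussian noise so the total noise variance roughly
    # matches a standard logistic (pi^2 / 3); the actual DP MCMC test
    # adds a further correction variable, which this sketch omits.
    add_var = max(np.pi**2 / 3 - s2, 0.0)
    noisy = delta + rng.normal(0.0, np.sqrt(add_var))

    # Barker acceptance: accept iff the noisy statistic is positive
    return (theta_prop, True) if noisy > 0 else (theta, False)
```

With the synthetic data above and the assumption that tempering scales the log-likelihood by $N_0/N$, the quoted setup would correspond to `temperature = 100 / 10**6` and `b = 1000`.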