Functional Rényi Differential Privacy for Generative Modeling
Authors: Dihong Jiang, Sun Sun, Yaoliang Yu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, the new training paradigm achieves a significant improvement in privacy-utility trade-off compared to existing alternatives, especially when ϵ = 0.2. Our method is evaluated and compared across a wide variety of image datasets and DP guarantees, where our method consistently outperforms other baselines by a large margin. |
| Researcher Affiliation | Collaboration | Dihong Jiang (1,2), Sun Sun (1,3), and Yaoliang Yu (1,2). Affiliations: 1. School of Computer Science, University of Waterloo; 2. Vector Institute; 3. National Research Council Canada. {dihong.jiang,sun.sun,yaoliang.yu}@uwaterloo.ca |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/dihjiang/DP-kernel. |
| Open Datasets | Yes | Datasets: We consider widely used image benchmarks in related works, i.e., MNIST [15], Fashion MNIST [34], and CelebA [20]. MNIST and Fashion MNIST are made available under the Creative Commons Attribution-Share Alike 3.0 license and the MIT License, respectively. The CelebA dataset is available for non-commercial research purposes only, as described on their website. |
| Dataset Splits | Yes | For MNIST and Fashion MNIST, we generate images conditioned on the 10 respective labels. For CelebA, we condition on gender. Descriptions and pre-processing of the datasets are given in Appendix B. For MNIST and Fashion MNIST... we adopt the official training and test split. For CelebA... we also adopt the official training, validation and test split, but randomly select 60k images from the training split as our training set. |
| Hardware Specification | Yes | All computation is conducted by one NVIDIA T4 GPU. |
| Software Dependencies | No | The paper mentions TensorFlow Privacy but does not provide version numbers for any software dependencies: “We use Tensorflow privacy for computing the total privacy cost, which only requires inputting a few important parameters, e.g. subsampling rate (or batch size), noise multiplier, training epochs (iterations), target δ (δ = 10⁻⁵ in all our experiments).” A hedged sketch of this accounting step is given below the table. |
| Experiment Setup | Yes | Our unconditional generative network is based on the official code of MMDGAN [16]. For the NN architecture, we use the same Conv2d transpose layers with similar depth as prior related works... All networks are optimized by RMSprop with a learning rate of 5 × 10⁻⁵. The CNN consists of the following layers: Conv2d(input_channels, 32, kernel_size=3, stride=2, padding=1) → Dropout(p=0.5) → ReLU → Conv2d(32, 64, kernel_size=3, stride=2, padding=1) → Dropout(p=0.5) → ReLU → Flatten → Linear(flatten_dim, output_dim) → Softmax. The CNN classifier is optimized by Adam with default parameters. We summarize the parameters in Table 2. A hedged PyTorch sketch of this classifier is given below the table. |
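
The privacy-accounting step quoted in the Software Dependencies row can be sketched with TensorFlow Privacy's `compute_dp_sgd_privacy` helper, which takes exactly the inputs the paper names: dataset size and batch size (which fix the subsampling rate), noise multiplier, epochs, and target δ. This is a minimal sketch, not the authors' script: only δ = 10⁻⁵ comes from the quoted text, the other numeric values are illustrative placeholders, and the import path may differ across TensorFlow Privacy versions.

```python
# Minimal sketch of the total-privacy-cost computation described in the paper,
# assuming TensorFlow Privacy's classic accounting helper. All values except
# delta=1e-5 are placeholders, not the paper's reported hyperparameters.
from tensorflow_privacy import compute_dp_sgd_privacy

eps, opt_order = compute_dp_sgd_privacy(
    n=60_000,              # training-set size (placeholder)
    batch_size=512,        # fixes the subsampling rate batch_size / n
    noise_multiplier=1.0,  # Gaussian noise std relative to the clipping norm
    epochs=10,             # total passes over the data
    delta=1e-5,            # target delta, as stated in the paper
)
print(f"Total privacy cost: ({eps:.2f}, 1e-5)-DP at RDP order {opt_order}")
```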
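
The downstream CNN classifier in the Experiment Setup row is specified layer by layer, so it can be rendered directly. The snippet below is a minimal PyTorch sketch under stated assumptions: `input_channels = 1`, `output_dim = 10`, and 28×28 inputs (the MNIST setting); `flatten_dim` then follows from the two stride-2 convolutions. The trailing Softmax mirrors the quoted layer list, even though a cross-entropy loss would normally consume raw logits.

```python
# Minimal PyTorch sketch of the CNN classifier quoted above. input_channels,
# num_classes, and image_size are assumptions for the MNIST setting; only the
# layer sequence and the Adam-with-defaults optimizer come from the paper.
import torch
import torch.nn as nn

def make_cnn_classifier(input_channels: int = 1, num_classes: int = 10,
                        image_size: int = 28) -> nn.Sequential:
    # Each stride-2 conv (kernel 3, padding 1) halves the spatial size,
    # so 28x28 inputs become 7x7 feature maps with 64 channels.
    flatten_dim = 64 * (image_size // 4) ** 2
    return nn.Sequential(
        nn.Conv2d(input_channels, 32, kernel_size=3, stride=2, padding=1),
        nn.Dropout(p=0.5),
        nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
        nn.Dropout(p=0.5),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(flatten_dim, num_classes),
        nn.Softmax(dim=1),  # as in the quoted layer list
    )

model = make_cnn_classifier()
optimizer = torch.optim.Adam(model.parameters())  # Adam with default parameters
probs = model(torch.randn(8, 1, 28, 28))          # batch of 8 MNIST-sized images
print(probs.shape)                                # torch.Size([8, 10])
```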