Privacy for Free: How does Dataset Condensation Help Privacy?
Authors: Tian Dong, Bo Zhao, Lingjuan Lyu
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also empirically validate the visual privacy and membership privacy of DC-synthesized data by launching both the loss-based and the state-of-the-art likelihood-based membership inference attacks. Through empirical evaluations on image datasets, we validate that DC-synthesized data can preserve both data efficiency and membership privacy when being used for model training. |
| Researcher Affiliation | Collaboration | 1Department of Computer Science and Engineering, Shanghai Jiao Tong University 2School of Informatics, The University of Edinburgh 3Sony AI. |
| Pseudocode | No | The paper contains mathematical equations and descriptions of methods but no explicit pseudocode or algorithm blocks are provided. |
| Open Source Code | No | The paper does not provide any explicit statements about making the source code available or links to a code repository for the described methodology. |
| Open Datasets | Yes | We use three datasets: Fashion MNIST (Xiao et al., 2017), CIFAR-10 (Krizhevsky et al., 2009) and CelebA (Liu et al., 2015) for gender classification. |
| Dataset Splits | No | The paper describes how datasets are split into member and non-member subsets for MIA evaluation (e.g., 'randomly split it into two subsets of equal amount of samples'), but it does not specify a validation split for model training, separate from the training and testing sets. |
| Hardware Specification | No | The paper states 'All experiments are conducted with Pytorch 1.10 on a Ubuntu 20.04 server.' but does not provide specific hardware details such as GPU/CPU models, memory, or processor types. |
| Software Dependencies | Yes | All experiments are conducted with Pytorch 1.10 on a Ubuntu 20.04 server. |
| Experiment Setup | Yes | One important hyperparameter of DSA, DM and KIP is the ratio of images per class r_ipc = \|S\| / \|T\|. We evaluate r_ipc = 0.002, 0.01 for all methods, and for DM we add an extra evaluation r_ipc = 0.02... We reproduce DM (Zhao & Bilen, 2021a) and adopt large learning rates to accelerate the condensation (i.e., 10, 50, 100 as learning rate for r_ipc = 0.002, 0.01, 0.02, respectively). ...set learning rate 0.04 and 0.1 for r_ipc = 0.002 and 0.01, respectively. |
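The loss-based membership inference attack cited under Research Type can be sketched as a simple threshold rule on per-sample loss: trained models typically fit member samples more tightly, so a low loss suggests membership. The threshold and loss values below are illustrative assumptions, not figures from the paper.

```python
def loss_based_mia(losses, threshold):
    # Predict "member" (True) when the per-sample loss falls below the
    # threshold; the attacker only needs black-box access to loss values.
    return [loss < threshold for loss in losses]

# Hypothetical per-sample cross-entropy losses.
member_losses = [0.05, 0.10, 0.20]     # samples the model was trained on
nonmember_losses = [1.2, 0.9, 2.5]     # held-out samples

print(loss_based_mia(member_losses, threshold=0.5))     # → [True, True, True]
print(loss_based_mia(nonmember_losses, threshold=0.5))  # → [False, False, False]
```

A real evaluation, as in the paper, would compare attack accuracy on models trained on DC-synthesized data versus the original data; the stronger likelihood-based attack replaces the raw loss with a calibrated likelihood score.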