Privacy for Free: How does Dataset Condensation Help Privacy?

Authors: Tian Dong, Bo Zhao, Lingjuan Lyu

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We also empirically validate the visual privacy and membership privacy of DC-synthesized data by launching both the loss-based and the state-of-the-art likelihood-based membership inference attacks. Through empirical evaluations on image datasets, we validate that DC-synthesized data can preserve both data efficiency and membership privacy when being used for model training." (A minimal loss-based MIA sketch appears after this table.)
Researcher Affiliation | Collaboration | (1) Department of Computer Science and Engineering, Shanghai Jiao Tong University; (2) School of Informatics, The University of Edinburgh; (3) Sony AI.
Pseudocode | No | The paper contains mathematical equations and method descriptions, but no explicit pseudocode or algorithm blocks are provided.
Open Source Code | No | The paper makes no explicit statement that source code is available and provides no link to a code repository for the described methodology.
Open Datasets | Yes | "We use three datasets: Fashion MNIST (Xiao et al., 2017), CIFAR-10 (Krizhevsky et al., 2009) and CelebA (Liu et al., 2015) for gender classification." (A torchvision loading sketch appears after this table.)
Dataset Splits | No | The paper describes how each dataset is split into member and non-member subsets for the MIA evaluation (e.g., 'randomly split it into two subsets of equal amount of samples'), but it does not define a validation split for model training that is distinct from the training and test sets. (A sketch of the member/non-member split appears after this table.)
Hardware Specification | No | The paper states 'All experiments are conducted with Pytorch 1.10 on a Ubuntu 20.04 server.' but does not provide specific hardware details such as GPU/CPU models or memory capacity.
Software Dependencies | Yes | "All experiments are conducted with Pytorch 1.10 on a Ubuntu 20.04 server."
Experiment Setup | Yes | "One important hyperparameter of DSA, DM and KIP is the ratio of image per class r_ipc = |S| / |T|. We evaluate r_ipc = 0.002, 0.01 for all methods, and for DM we add an extra evaluation r_ipc = 0.02... We reproduce DM (Zhao & Bilen, 2021a) and adopt large learning rates to accelerate the condensation (i.e., 10, 50, 100 as learning rate for r_ipc = 0.002, 0.01, 0.02, respectively). ...set learning rate 0.04 and 0.1 for r_ipc = 0.002 and 0.01, respectively." (A sketch of the r_ipc / learning-rate configuration appears after this table.)
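
The Research Type row cites a loss-based membership inference attack. The following is a minimal sketch of a loss-threshold MIA, not the paper's exact procedure: `model`, the batch `(x, y)`, and the `threshold` calibration are all assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def loss_based_mia(model, x, y, threshold):
    """Flag samples whose per-sample cross-entropy loss falls below `threshold` as members.

    Intuition: a model usually fits its training (member) samples more tightly,
    so members tend to have lower loss than non-members.
    """
    model.eval()
    with torch.no_grad():
        per_sample_loss = F.cross_entropy(model(x), y, reduction="none")
    return per_sample_loss < threshold  # True = predicted member
```

In practice the threshold would be calibrated on held-out data; the likelihood-based attack mentioned in the same row uses shadow models rather than a single loss cutoff.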
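
For the Open Datasets row, a minimal sketch of loading the three public datasets via torchvision; the `root` path and transforms are placeholders, and the CelebA gender-label convention in the comment is an assumption rather than something stated in the row.

```python
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# Fashion MNIST and CIFAR-10 training splits, downloaded on first use.
fmnist = datasets.FashionMNIST(root="data", train=True, download=True, transform=to_tensor)
cifar10 = datasets.CIFAR10(root="data", train=True, download=True, transform=to_tensor)

# CelebA is used for gender classification; the binary label is commonly taken
# from the "Male" attribute in torchvision's attribute vector (target_type="attr").
celeba = datasets.CelebA(root="data", split="train", target_type="attr",
                         download=True, transform=to_tensor)
```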
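
For the Dataset Splits row, a minimal sketch of the quoted member/non-member split, assuming a standard PyTorch `Dataset`; the 50/50 random split follows the quoted description, while the helper name and seed handling are illustrative.

```python
import torch
from torch.utils.data import random_split

def member_nonmember_split(dataset, seed=0):
    """Randomly split a dataset into two equally sized halves:
    one used to train the target model (members), the other held out (non-members)."""
    n = len(dataset)
    generator = torch.Generator().manual_seed(seed)
    members, nonmembers = random_split(dataset, [n // 2, n - n // 2], generator=generator)
    return members, nonmembers
```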
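
The Experiment Setup row ties the condensation learning rate for DM to the ratio r_ipc = |S| / |T|. A minimal sketch of that mapping, reproducing only the DM values quoted above; the helper `images_per_class` is hypothetical, and the CIFAR-10 numbers in the example are standard dataset facts, not taken from the row.

```python
# Learning rates quoted for DM at each evaluated ratio r_ipc.
DM_LEARNING_RATE = {0.002: 10, 0.01: 50, 0.02: 100}

def images_per_class(r_ipc, real_train_size, num_classes):
    """Turn the ratio r_ipc = |S| / |T| into a per-class budget for the synthetic set S."""
    return max(1, round(r_ipc * real_train_size / num_classes))

# Example: CIFAR-10 has 50,000 training images over 10 classes,
# so r_ipc = 0.01 corresponds to 50 synthetic images per class.
assert images_per_class(0.01, 50_000, 10) == 50
```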