Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training

Authors: Kristjan Greenewald, Yuancheng Yu, Hao Wang, Kai Xu

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive numerical experiments demonstrate that our approach can generate synthetic data of higher quality compared with baselines.
Researcher Affiliation | Collaboration | Kristjan Greenewald (MIT-IBM Watson AI Lab, IBM Research; kristjan.h.greenewald@ibm.com); Yuancheng Yu (UIUC; yyu51@illinois.edu); Hao Wang (MIT-IBM Watson AI Lab, IBM Research; hao@ibm.com); Kai Xu (MIT-IBM Watson AI Lab, IBM Research; xuk@ibm.com)
Pseudocode | Yes | Algorithm 1: Training DP generative models with the smoothed-sliced f-divergence. (A hedged illustration of the slicing idea follows this table.)
Open Source Code | No | The paper does not provide a direct link to a source code repository or an explicit statement about releasing the code for the work described in this paper.
Open Datasets | Yes | We validate both our method and baselines using the US Census data derived from the American Community Survey (ACS) Public Use Microdata Sample (PUMS). Using the API of the Folktables package [DHMS21], we access the 2018 California data. (See the Folktables sketch after this table.)
Dataset Splits | No | The paper mentions training and testing data, and subsampling for privacy amplification, but does not explicitly provide details of a validation split or its proportion.
Hardware Specification | Yes | For our method and baselines, each model was trained using a V100 GPU, with runtimes typically less than 2 hours for our method (200 epochs).
Software Dependencies | No | The paper mentions using an 'open-source Python library [Sma23]' and the 'Folktables package [DHMS21]' but does not provide specific version numbers for Python, PyTorch, or other key software dependencies.
Experiment Setup | Yes | For our method and Slice Wass, all experiments used a batch size of 128 and a learning rate of 2 × 10⁻⁵, and ran for 200 epochs. (See the configuration sketch after this table.)
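Algorithm 1 itself is not reproduced on this page. As a rough, hedged illustration of the general idea behind adding noise to sliced statistics rather than to gradients, the sketch below projects private data onto random directions and perturbs those projections once, so that later generator updates only ever touch the noisy slices. It is a simplified stand-in, not the paper's algorithm; the number of slices, the noise scale `sigma`, and the downstream loss are placeholders.

```python
# Illustrative sketch only: privatize random 1-D projections ("slices") of the
# private data once, then train on the noisy projections instead of adding
# noise at every gradient step. Not the paper's Algorithm 1; num_slices and
# sigma are placeholders.
import numpy as np

rng = np.random.default_rng(0)

def noisy_slices(private_data, num_slices=50, sigma=1.0):
    """Project data onto random unit directions and add Gaussian noise once."""
    n, d = private_data.shape
    directions = rng.normal(size=(d, num_slices))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)
    projections = private_data @ directions              # shape (n, num_slices)
    return projections + rng.normal(scale=sigma, size=projections.shape), directions

real = rng.normal(size=(1000, 10))       # stands in for the private dataset
noisy_proj, dirs = noisy_slices(real)
# A generator would then be trained to match these fixed noisy projections
# (e.g. via an f-divergence estimate), with no per-iteration gradient noise.
```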
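The ACS PUMS access route described in the Open Datasets row follows the standard Folktables workflow. A minimal sketch, assuming the `folktables` package is installed and using the ACSIncome task purely for illustration (the paper's exact task and feature set are not specified on this page):

```python
# Minimal sketch of loading 2018 California ACS PUMS data via Folktables.
# Assumes `pip install folktables`; the ACSIncome task is illustrative only.
from folktables import ACSDataSource, ACSIncome

data_source = ACSDataSource(survey_year='2018', horizon='1-Year', survey='person')
ca_data = data_source.get_data(states=["CA"], download=True)

# Convert to feature/label arrays for a downstream generative-model pipeline.
features, labels, _ = ACSIncome.df_to_numpy(ca_data)
print(features.shape, labels.shape)
```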
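The hyperparameters in the Experiment Setup row map directly onto a standard PyTorch training configuration. The sketch below takes only the batch size, learning rate, and epoch count from the paper; the Adam optimizer, the toy generator, and the stand-in loss are assumptions for illustration and do not implement the smoothed-sliced f-divergence.

```python
# Sketch of the reported training configuration: batch size 128, learning
# rate 2e-5, 200 epochs, single GPU (a V100 in the paper). The optimizer,
# generator, and loss below are placeholders, not the paper's method.
import torch
from torch.utils.data import DataLoader, TensorDataset

BATCH_SIZE, LEARNING_RATE, EPOCHS = 128, 2e-5, 200
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

data = torch.randn(10_000, 32)                        # placeholder tabular data
loader = DataLoader(TensorDataset(data), batch_size=BATCH_SIZE, shuffle=True)

generator = torch.nn.Sequential(                      # placeholder generator
    torch.nn.Linear(16, 64), torch.nn.ReLU(), torch.nn.Linear(64, 32)
).to(device)
optimizer = torch.optim.Adam(generator.parameters(), lr=LEARNING_RATE)

for epoch in range(EPOCHS):
    for (batch,) in loader:
        batch = batch.to(device)
        noise = torch.randn(batch.size(0), 16, device=device)
        fake = generator(noise)
        loss = (fake.mean(0) - batch.mean(0)).pow(2).mean()  # stand-in loss only
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```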