DP-Mix: Mixup-based Data Augmentation for Differentially Private Learning

Authors: Wenxuan Bao, Francesco Pittaluga, Vijay Kumar B G, Vincent Bindschaedler

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 experiments; We perform ablation experiments to better understand why our methods consistently and significantly outperform the prior SoTA techniques.
Researcher Affiliation | Collaboration | 1 University of Florida; 2 NEC Labs America
Pseudocode | Yes | Algorithm 1: DP-SGD with mixup (DP-MixSelf and DP-MixDiff); an illustrative sketch of this step follows the table.
Open Source Code | Yes | We open-source the code at https://github.com/wenxuan-Bao/DP-Mix.
Open Datasets | Yes | We use CIFAR-10, CIFAR-100, EuroSAT, Caltech256, SUN397 and Oxford-IIIT Pet. The details of these datasets are in Appendix C of the supplemental materials.
Dataset Splits | No | For CIFAR-10, we use 50,000 data points for training and 10,000 for the test set. Similar train/test splits are provided for the other datasets, but no explicit validation splits are mentioned.
Hardware Specification | Yes | All experimental runs used a single A100 GPU and were based on the same task of fine-tuning the ViT-B-16 model on the Caltech256 dataset for 10 epochs.
Software Dependencies | No | The paper mentions using 'Opacus' and 'PyTorch' but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | For training-from-scratch experiments, we set the batch size to 4096, the number of self-augmentations to 16, the clip bound to C = 1, and the number of epochs to 200. For fine-tuning experiments, we change the batch size to 1000 and the number of epochs to 20 for EuroSAT and 10 for all other datasets. These settings are restated in the configuration sketch after the table.
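To make the Algorithm 1 row concrete, below is a minimal PyTorch sketch of one DP-SGD step on inputs built by mixing self-augmentations of each example (the DP-MixSelf idea; DP-MixDiff additionally mixes in diffusion-generated images and is not shown). The augmentation pipeline, the Dirichlet mixing weights, the function names, and the noise_multiplier/lr defaults are illustrative assumptions, not the paper's exact recipe; the paper itself uses Opacus for per-example clipping, noising, and privacy accounting.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

# Hypothetical augmentation pipeline for 32x32 inputs (e.g. CIFAR);
# the paper's exact transforms are described in its appendix.
augment = T.Compose([
    T.RandomResizedCrop(32, scale=(0.8, 1.0)),
    T.RandomHorizontalFlip(),
])

def mix_self_augmentations(image, num_aug=16):
    """Mix `num_aug` augmented views of a single image into one training input
    (the DP-MixSelf idea). All views come from the same image, so the label is
    unchanged. Dirichlet mixing weights are an illustrative assumption."""
    views = torch.stack([augment(image) for _ in range(num_aug)])   # (k, C, H, W)
    weights = torch.distributions.Dirichlet(torch.ones(num_aug)).sample()
    return (weights.view(-1, 1, 1, 1) * views).sum(dim=0)           # convex combination

def dp_sgd_step(model, images, labels, clip_bound=1.0, noise_multiplier=1.0, lr=0.1):
    """One simplified DP-SGD step: per-example gradients on mixed inputs,
    clipping to `clip_bound`, then Gaussian noise on the summed gradient.
    A didactic sketch; in practice a library such as Opacus handles this."""
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(images, labels):
        model.zero_grad()
        logits = model(mix_self_augmentations(x).unsqueeze(0))
        loss = F.cross_entropy(logits, y.unsqueeze(0))
        loss.backward()
        grads = [p.grad.detach() for p in model.parameters()]
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_bound / (norm + 1e-12), max=1.0)   # per-example clipping
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    with torch.no_grad():
        for p, s in zip(model.parameters(), summed):
            noise = torch.normal(0.0, noise_multiplier * clip_bound, size=s.shape)
            p.add_(-lr * (s + noise) / len(images))
```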
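The experiment-setup row reports two training regimes. The sketch below restates only the values given in that row; the key names are illustrative, and fields not stated there (optimizer, learning rate, noise multiplier) are intentionally omitted and would have to come from the paper or its code.

```python
# Hyperparameters as reported in the experiment-setup row above.
FROM_SCRATCH = {
    "batch_size": 4096,
    "num_self_augmentations": 16,
    "clip_bound": 1.0,
    "epochs": 200,
}
FINE_TUNE = {
    "batch_size": 1000,
    "epochs": {"EuroSAT": 20, "default": 10},  # 10 epochs for all other datasets
}
```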