DP-Mix: Mixup-based Data Augmentation for Differentially Private Learning

Authors: Wenxuan Bao, Francesco Pittaluga, Vijay Kumar B G, Vincent Bindschaedler

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 experiments; We perform ablation experiments to better understand why our methods consistently and significantly outperform the prior SoTA techniques.
Researcher Affiliation | Collaboration | 1 University of Florida; 2 NEC Labs America
Pseudocode | Yes | Algorithm 1: DP-SGD with mixup (DP-MixSelf and DP-MixDiff); an illustrative sketch of this step follows the table.
Open Source Code | Yes | We open-source the code at https://github.com/wenxuan-Bao/DP-Mix.
Open Datasets | Yes | We use CIFAR-10, CIFAR-100, EuroSAT, Caltech256, SUN397 and Oxford-IIIT Pet. The details of these datasets are in Appendix C of the supplemental materials.
Dataset Splits | No | For CIFAR-10, we use 50,000 data points for training and 10,000 for the test set. Similar train/test splits are provided for the other datasets, but no explicit validation splits are mentioned.
Hardware Specification | Yes | All experimental runs used a single A100 GPU and were based on the same task of fine-tuning the ViT-B-16 model on the Caltech256 dataset for 10 epochs.
Software Dependencies | No | The paper mentions using 'Opacus' and 'PyTorch' but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | For training-from-scratch experiments, we set the batch size to 4096, the number of self-augmentations to 16, the clip bound to C = 1, and the number of epochs to 200. For fine-tuning experiments, we change the batch size to 1000 and the number of epochs to 20 for EuroSAT and 10 for all other datasets. These settings are restated in the configuration sketch after the table.
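To make the Algorithm 1 row concrete, below is a minimal PyTorch sketch of one DP-SGD step on inputs built by mixing self-augmentations of each example (the DP-MixSelf idea; DP-MixDiff additionally mixes in diffusion-generated images and is not shown). The augmentation pipeline, the Dirichlet mixing weights, the function names, and the noise_multiplier/lr defaults are illustrative assumptions, not the paper's exact recipe; the paper itself uses Opacus for per-example clipping, noising, and privacy accounting.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

# Hypothetical augmentation pipeline for 32x32 inputs (e.g. CIFAR);
# the paper's exact transforms are described in its appendix.
augment = T.Compose([
    T.RandomResizedCrop(32, scale=(0.8, 1.0)),
    T.RandomHorizontalFlip(),
])

def mix_self_augmentations(image, num_aug=16):
    """Mix `num_aug` augmented views of a single image into one training input
    (the DP-MixSelf idea). All views come from the same image, so the label is
    unchanged. Dirichlet mixing weights are an illustrative assumption."""
    views = torch.stack([augment(image) for _ in range(num_aug)])   # (k, C, H, W)
    weights = torch.distributions.Dirichlet(torch.ones(num_aug)).sample()
    return (weights.view(-1, 1, 1, 1) * views).sum(dim=0)           # convex combination

def dp_sgd_step(model, images, labels, clip_bound=1.0, noise_multiplier=1.0, lr=0.1):
    """One simplified DP-SGD step: per-example gradients on mixed inputs,
    clipping to `clip_bound`, then Gaussian noise on the summed gradient.
    A didactic sketch; in practice a library such as Opacus handles this."""
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(images, labels):
        model.zero_grad()
        logits = model(mix_self_augmentations(x).unsqueeze(0))
        loss = F.cross_entropy(logits, y.unsqueeze(0))
        loss.backward()
        grads = [p.grad.detach() for p in model.parameters()]
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_bound / (norm + 1e-12), max=1.0)   # per-example clipping
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    with torch.no_grad():
        for p, s in zip(model.parameters(), summed):
            noise = torch.normal(0.0, noise_multiplier * clip_bound, size=s.shape)
            p.add_(-lr * (s + noise) / len(images))
```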
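The experiment-setup row reports two training regimes. The sketch below restates only the values given in that row; the key names are illustrative, and fields not stated there (optimizer, learning rate, noise multiplier) are intentionally omitted and would have to come from the paper or its code.

```python
# Hyperparameters as reported in the experiment-setup row above.
FROM_SCRATCH = {
    "batch_size": 4096,
    "num_self_augmentations": 16,
    "clip_bound": 1.0,
    "epochs": 200,
}
FINE_TUNE = {
    "batch_size": 1000,
    "epochs": {"EuroSAT": 20, "default": 10},  # 10 epochs for all other datasets
}
```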