DP-Mix: Mixup-based Data Augmentation for Differentially Private Learning
Authors: Wenxuan Bao, Francesco Pittaluga, Vijay Kumar B G, Vincent Bindschaedler
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 experiments. "We perform ablation experiments to better understand why our methods consistently and significantly outperform the prior SoTA techniques." |
| Researcher Affiliation | Collaboration | University of Florida; NEC Labs America |
| Pseudocode | Yes | Algorithm 1: DP-SGD with mixup (DP-MixSelf and DP-MixDiff); a hedged sketch appears after this table. |
| Open Source Code | Yes | We open-source the code at https://github.com/wenxuan-Bao/DP-Mix. |
| Open Datasets | Yes | We use CIFAR-10, CIFAR-100, EuroSAT, Caltech256, SUN397, and Oxford-IIIT Pet. The details of these datasets are in Appendix C of the supplemental materials. |
| Dataset Splits | No | For CIFAR-10, we use 50,000 data points for training and 10,000 for the test set. Similar train/test splits are provided for the other datasets, but no explicit validation splits are mentioned. |
| Hardware Specification | Yes | All experimental runs utilized a single A100 GPU and were based on the same task of fine-tuning the ViT-B-16 model on the Caltech256 dataset for 10 epochs. |
| Software Dependencies | No | The paper mentions using 'Opacus' and 'PyTorch' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | For training-from-scratch experiments, we set the batch size to 4096, the number of self-augmentations to 16, the clip bound to C = 1, and the number of epochs to 200. For fine-tuning experiments, we change the batch size to 1000 and the number of epochs to 20 for EuroSAT and 10 for all other datasets. (An illustrative Opacus configuration with these values follows the table.) |
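The Algorithm 1 referenced in the pseudocode row combines DP-SGD's per-example gradient clipping and Gaussian noising with mixup over multiple self-augmentations of each example (the DP-MixSelf variant). The PyTorch sketch below is a minimal illustration under that reading; the `augment` callable, the mixup parameter, and the noise multiplier are stand-ins, not the authors' released implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def dp_mixself_step(model, optimizer, images, labels, *,
                    num_self_aug=16, clip_bound=1.0, noise_multiplier=1.0,
                    mixup_alpha=1.0, augment=None):
    """One DP-SGD step with self-augmentation mixup (sketch of DP-MixSelf).

    `augment` is a hypothetical per-image augmentation callable; all defaults
    here are illustrative rather than the paper's exact configuration.
    """
    if augment is None:
        augment = lambda x: x  # fall back to the identity if no augmentation is given
    batch_size = images.size(0)
    summed_grads = [torch.zeros_like(p) for p in model.parameters()]

    for i in range(batch_size):
        # Build K augmented copies of example i and mix neighbouring copies.
        copies = torch.stack([augment(images[i]) for _ in range(num_self_aug)])
        lam = torch.distributions.Beta(mixup_alpha, mixup_alpha).sample()
        mixed = lam * copies + (1 - lam) * copies.roll(1, dims=0)
        targets = labels[i].repeat(num_self_aug)  # same label, so no label mixing is needed

        # Per-example loss and gradient: sensitivity stays tied to one original example.
        loss = F.cross_entropy(model(mixed), targets)
        grads = torch.autograd.grad(loss, list(model.parameters()))

        # Clip the per-example gradient to norm C before accumulating.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_bound / (total_norm + 1e-6)).clamp(max=1.0)
        for acc, g in zip(summed_grads, grads):
            acc.add_(g * scale)

    # Add Gaussian noise calibrated to the clip bound, average, and take a step.
    optimizer.zero_grad()
    for p, acc in zip(model.parameters(), summed_grads):
        noise = torch.randn_like(acc) * noise_multiplier * clip_bound
        p.grad = (acc + noise) / batch_size
    optimizer.step()
```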
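The experiment setup and software dependency rows mention fine-tuning with Opacus and PyTorch (batch size 1000, clip bound C = 1, 10 epochs for most datasets). A minimal sketch of how those values could be wired into Opacus's PrivacyEngine is shown below; the toy model, data, and privacy budget (target_epsilon, target_delta) are placeholder assumptions rather than values taken from the paper.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy stand-ins for the real setup (a pretrained ViT-B-16 fine-tuned on
# Caltech256); the privacy budget below is an assumption, not a value quoted above.
model = torch.nn.Linear(512, 257)                 # placeholder for the actual network
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
data = TensorDataset(torch.randn(2000, 512), torch.randint(0, 257, (2000,)))
train_loader = DataLoader(data, batch_size=1000)  # fine-tuning batch size 1000

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    target_epsilon=8.0,   # assumption: pick the desired privacy budget
    target_delta=1e-5,    # assumption
    epochs=10,            # fine-tuning epochs reported for most datasets
    max_grad_norm=1.0,    # clip bound C = 1 from the setup description
)
```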