Pre-training Differentially Private Models with Limited Public Data
Authors: Zhiqi Bu, Xinwei Zhang, Sheng Zha, Mingyi Hong, George Karypis
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, using only 10% of public data and 90% of private data, our strategy can achieve DP accuracy of 41.5% on ImageNet-21k (with ϵ = 8), as well as non-DP accuracy of 55.7% and 60.0% on downstream tasks Places365 and iNaturalist-2021, respectively, on par with state-of-the-art standard pre-training and substantially outperforming existing DP pre-trained models. |
| Researcher Affiliation | Collaboration | Zhiqi Bu Amazon Xinwei Zhang University of Southern California Sheng Zha Amazon Mingyi Hong University of Minnesota George Karypis Amazon |
| Pseudocode | Yes | Algorithm 1 DP continual pre-training (a hedged sketch appears after this table) |
| Open Source Code | Yes | Our DP pre-trained models are released in the fastDP library (https://github.com/awslabs/fast-differential-privacy/releases/tag/v2.1). |
| Open Datasets | Yes | We use ImageNet-1k (1.3M images, 1k classes; [25]) for public pre-training, then ImageNet-11k (formally known as ImageNet-21k-P, 11M images, 11k classes; [70]) for private pre-training. |
| Dataset Splits | No | The paper mentions training and testing sets (e.g., '50,000 training and 10,000 test images' for CIFAR-10/100, and 'train:test = 10.5M:0.52M' for ImageNet-11k), but does not explicitly provide percentages or counts for a separate validation split for the primary model training. |
| Hardware Specification | No | The paper mentions 'multi-GPU distributed system' and 'GPU memory' but does not provide specific details on the CPU or GPU models used (e.g., NVIDIA A100, Tesla V100, Intel Xeon, etc.) or other detailed hardware specifications. |
| Software Dependencies | Yes | Our DP pre-trained models are released in the fastDP library (https://github.com/awslabs/fast-differential-privacy/releases/tag/v2.1). |
| Experiment Setup | Yes | We employ the AdamW optimizer with batch size B = 4096 and learning rate η = 0.0002 set by line search (a configuration sketch also appears after this table). |
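
The paper reports Algorithm 1 (DP continual pre-training) as pseudocode, and the released implementation lives in the fastDP library. The following is a minimal sketch of the two-phase idea only: a standard non-DP warm-up on the small public split, followed by DP-SGD-style updates (per-sample gradient clipping plus Gaussian noise) on the large private split. It assumes a PyTorch-style training loop; the function and loader names, the naive per-sample gradient computation, and the default `clip_norm`/`noise_multiplier` values are illustrative assumptions, not the paper's or the library's actual API.

```python
import torch
from torch.optim import AdamW

def continual_pretrain(model, public_loader, private_loader,
                       public_steps, private_steps,
                       clip_norm=1.0, noise_multiplier=0.5, lr=2e-4):
    """Two-phase sketch: non-DP warm-up on public data, then DP-SGD on private data."""
    optimizer = AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.functional.cross_entropy

    # Phase 1: standard (non-DP) pre-training on the small public split.
    for _, (x, y) in zip(range(public_steps), public_loader):
        loss_fn(model(x), y).backward()
        optimizer.step()
        optimizer.zero_grad()

    # Phase 2: DP pre-training on the large private split, DP-SGD style:
    # per-sample gradient clipping followed by calibrated Gaussian noise.
    params = [p for p in model.parameters() if p.requires_grad]
    for _, (x, y) in zip(range(private_steps), private_loader):
        clipped = []
        for xi, yi in zip(x, y):  # naive per-sample gradients (slow but explicit)
            model.zero_grad()
            loss_fn(model(xi[None]), yi[None]).backward()
            g = torch.cat([p.grad.flatten() for p in params])
            scale = torch.clamp(clip_norm / (g.norm() + 1e-6), max=1.0)
            clipped.append(g * scale)
        noisy = torch.stack(clipped).sum(0)
        noisy = noisy + noise_multiplier * clip_norm * torch.randn_like(noisy)
        noisy = noisy / len(clipped)
        # Load the noisy averaged gradient back into the parameters and step.
        offset = 0
        for p in params:
            p.grad = noisy[offset:offset + p.numel()].view_as(p)
            offset += p.numel()
        optimizer.step()
        optimizer.zero_grad()
    return model
```
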
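For the reported experiment setup (AdamW, batch size B = 4096, learning rate η = 0.0002 chosen by line search), the sketch below shows one plausible way to reproduce that selection. The candidate learning-rate grid and the `short_run` evaluation helper are assumptions for illustration; the paper does not specify them.

```python
from torch.optim import AdamW
from torch.utils.data import DataLoader

BATCH_SIZE = 4096
CANDIDATE_LRS = [5e-5, 1e-4, 2e-4, 5e-4, 1e-3]  # assumed search grid

def pick_lr(model_fn, dataset, short_run):
    """Line search: train briefly at each candidate learning rate, keep the best score."""
    scores = {}
    for lr in CANDIDATE_LRS:
        model = model_fn()  # fresh model per candidate
        optimizer = AdamW(model.parameters(), lr=lr)
        loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)
        scores[lr] = short_run(model, optimizer, loader)  # e.g., held-out accuracy
    return max(scores, key=scores.get)
```

Under this reading, η = 0.0002 would simply be the grid point that maximized the short-run score before the full pre-training run.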