Differential Privacy Has Disparate Impact on Model Accuracy
Authors: Eugene Bagdasaryan, Omid Poursaeed, Vitaly Shmatikov
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate this effect for (1) gender classification, already notorious for bias in the existing models [7], and age classification on facial images, where DP-SGD degrades accuracy for the darker-skinned faces more than for the lighter-skinned ones; (2) sentiment analysis of tweets, where DP-SGD disproportionately degrades accuracy for users writing in African-American English; (3) species classification on the iNaturalist dataset, where DP-SGD disproportionately degrades accuracy for the underrepresented classes; and (4) federated learning of language models, where DP-SGD disproportionately degrades accuracy for users with bigger vocabularies. |
| Researcher Affiliation | Academia | Eugene Bagdasaryan Cornell Tech eugene@cs.cornell.edu Omid Poursaeed Cornell Tech op63@cornell.edu Vitaly Shmatikov Cornell Tech shmat@cs.cornell.edu |
| Pseudocode | Yes | Algorithm 1: Differentially Private SGD (DP-SGD) |
| Open Source Code | No | The paper uses existing open-source libraries such as PyTorch and TensorFlow Privacy but does not state that the code for the specific methodology or experiments described in this paper is made publicly available. |
| Open Datasets | Yes | We use the recently released Flickr-based Diversity in Faces (DiF) dataset [27] and the UTKFace dataset [39] as another source of darker-skinned faces. |
| Dataset Splits | No | The paper mentions 'test set' for gender classification and implies training data, but does not explicitly provide details about a separate 'validation' dataset split for hyperparameter tuning across all experiments. |
| Hardware Specification | Yes | We ran them on two NVIDIA Titan X GPUs. |
| Software Dependencies | No | The paper mentions using PyTorch [32] and TF Privacy [36], but does not provide specific version numbers for these software libraries or other dependencies. |
| Experiment Setup | Yes | We use a ResNet18 model [18] with 11M parameters pre-trained on ImageNet and train using the Adam optimizer, 0.0001 learning rate, and batch size b = 256. We run 60 epochs of DP training... |
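
The Algorithm 1 referenced above is the standard DP-SGD procedure (Abadi et al.): clip each example's gradient to a fixed norm, aggregate, and add Gaussian noise before the update. The sketch below is a minimal NumPy illustration of one such step, not the paper's code; the function name `dp_sgd_step` and the default hyperparameters are hypothetical (the learning rate 1e-4 mirrors the setup row above).

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=1e-4, clip_norm=1.0,
                noise_multiplier=1.1, rng=None):
    """One DP-SGD update: clip each per-example gradient to `clip_norm`,
    sum the clipped gradients, add Gaussian noise with standard deviation
    noise_multiplier * clip_norm, then average and take a gradient step."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale the gradient down only if its norm exceeds clip_norm.
        clipped.append(g / max(1.0, norm / clip_norm))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    noisy_mean = (total + noise) / len(per_example_grads)
    return params - lr * noisy_mean

# Toy usage: four per-example gradients for a 3-parameter model.
params = np.zeros(3)
grads = [np.array([3.0, 4.0, 0.0]), np.array([0.1, 0.1, 0.1]),
         np.array([0.0, 0.0, 2.0]), np.array([1.0, 0.0, 0.0])]
new_params = dp_sgd_step(params, grads)
```

The clipping bound caps every individual's influence on the update, which is exactly the mechanism the paper identifies as penalizing underrepresented groups: their examples tend to produce larger gradients, so they are clipped (and drowned out by noise) more often.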