Differential Privacy Has Disparate Impact on Model Accuracy

Authors: Eugene Bagdasaryan, Omid Poursaeed, Vitaly Shmatikov

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically demonstrate this effect for (1) gender classification, already notorious for bias in the existing models [7], and age classification on facial images, where DP-SGD degrades accuracy for the darker-skinned faces more than for the lighter-skinned ones; (2) sentiment analysis of tweets, where DP-SGD disproportionately degrades accuracy for users writing in African-American English; (3) species classification on the iNaturalist dataset, where DP-SGD disproportionately degrades accuracy for the underrepresented classes; and (4) federated learning of language models, where DP-SGD disproportionately degrades accuracy for users with bigger vocabularies.
Researcher Affiliation | Academia | Eugene Bagdasaryan (Cornell Tech, eugene@cs.cornell.edu); Omid Poursaeed (Cornell Tech, op63@cornell.edu); Vitaly Shmatikov (Cornell Tech, shmat@cs.cornell.edu)
Pseudocode | Yes | Algorithm 1: Differentially Private SGD (DP-SGD). A minimal illustrative sketch of the DP-SGD update step appears after this table.
Open Source Code | No | The paper builds on existing open-source frameworks (PyTorch and TensorFlow Privacy) but does not state that the code for its own methodology or experiments is publicly released.
Open Datasets | Yes | We use the recently released Flickr-based Diversity in Faces (DiF) dataset [27] and the UTKFace dataset [39] as another source of darker-skinned faces.
Dataset Splits | No | The paper mentions a 'test set' for gender classification and implies a training split, but it does not describe a separate validation split for hyperparameter tuning in any of the experiments.
Hardware Specification | Yes | We ran them on two NVidia Titan X GPUs.
Software Dependencies | No | The paper mentions using PyTorch [32] and TF Privacy [36], but does not provide specific version numbers for these libraries or for any other dependencies.
Experiment Setup | Yes | We use a ResNet18 model [18] with 11M parameters pre-trained on ImageNet and train using the Adam optimizer, 0.0001 learning rate, and batch size b = 256. We run 60 epochs of DP training... A configuration sketch also appears after this table.
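
The Pseudocode row refers to Algorithm 1, which is the standard DP-SGD procedure of Abadi et al.: clip each per-example gradient to a fixed L2 norm, sum, add Gaussian noise, and descend. Below is a minimal sketch of one such step in plain PyTorch; the function name, hyperparameter defaults, and the naive per-example loop are illustrative assumptions, not the paper's exact implementation (the paper used TF Privacy / PyTorch).

```python
import torch

def dp_sgd_step(model, loss_fn, xs, ys, lr=0.01, clip_norm=1.0, noise_multiplier=1.1):
    """One DP-SGD step: clip each per-example gradient to L2 norm <= clip_norm,
    sum, add Gaussian noise with std = noise_multiplier * clip_norm, average,
    and take a gradient-descent step. Hyperparameter values are assumptions."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    # Naive per-example loop for clarity; real implementations vectorize this.
    # (Assumes the model tolerates batch size 1, i.e. no batch-statistics issues.)
    for x, y in zip(xs, ys):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-12), max=1.0)  # clip to C
        for s, g in zip(summed, grads):
            s.add_(g * scale)

    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.randn_like(p) * (noise_multiplier * clip_norm)
            p.add_(s + noise, alpha=-lr / len(xs))
```

The privacy budget (epsilon, delta) spent across many such steps is tracked separately with an accountant (e.g., the moments accountant); TF Privacy, which the paper uses, provides both a DP optimizer and this accounting.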
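
The Experiment Setup row can be expressed as a short configuration sketch with standard torchvision components. This is an assumption-laden reconstruction, not the paper's code: the data pipeline, loss, and the DP wrapper (e.g., the step sketched above) are omitted.

```python
import torch
from torchvision import models

# Reported setup: ResNet18 (~11M parameters) pre-trained on ImageNet,
# Adam optimizer with learning rate 1e-4, batch size b = 256,
# and 60 epochs of DP training.
model = models.resnet18(pretrained=True)  # newer torchvision: weights=models.ResNet18_Weights.IMAGENET1K_V1
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
BATCH_SIZE = 256
NUM_EPOCHS = 60
# DataLoader, loss function, and DP noise injection are omitted here;
# the paper does not specify library versions, so exact reproduction may vary.
```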