Differential Privacy has Bounded Impact on Fairness in Classification
Authors: Paul Mangold, Michaël Perrot, Aurélien Bellet, Marc Tommasi
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we numerically illustrate the upper bounds from Section 4.2. We use the CelebA (Liu et al., 2015) and folktables (Ding et al., 2021) datasets... In Table 1, we compute the value of Theorem 4.4's bounds. We learn a non-private ℓ2-regularized logistic regression model, and use it to compute the bounds... |
| Researcher Affiliation | Academia | Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/pmangold/fairness-privacy. |
| Open Datasets | Yes | We use the CelebA (Liu et al., 2015) and folktables (Ding et al., 2021) datasets... For each dataset, we use 90% of the records for training... The CelebA dataset... can be downloaded at http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html, and the folktables dataset... can be downloaded using a Python package available here https://github.com/zykls/folktables. (A hedged data-loading sketch follows the table.) |
| Dataset Splits | No | For each dataset, we use 90% of the records for training, and the remaining 10% for empirical evaluation of the bounds. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | On each dataset, for each value of n, we train a ℓ2-regularized logistic regression model using scikit-learn (Pedregosa et al., 2011). |
| Experiment Setup | Yes | We train ℓ2-regularized logistic regression models, ensuring that the underlying optimization problem is 1-strongly-convex. This allows learning private models by output perturbation, for which the bound from Theorem 4.4 holds. ... For each value of n and ϵ, we plot Theorem 4.4's theoretical guarantees... For the plots with different number of training records, we train 20 non-private models with a number of records logarithmically spaced between 10 and the number of records in the complete training set... For the plots with different privacy budgets, we use 20 values logarithmically spaced between 10⁻³ and 10 for both datasets. (A hedged training sketch follows the table.) |
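The folktables data mentioned in the Open Datasets row is distributed through a Python package. The sketch below shows one way to pull an ACS task and reproduce the 90%/10% train/evaluation split described above; the survey year, state, and task (ACSIncome) are illustrative assumptions, not settings reported in the paper.

```python
import numpy as np
from folktables import ACSDataSource, ACSIncome
from sklearn.model_selection import train_test_split

# Download one ACS prediction task via the folktables package.
# Survey year, state, and task are assumptions for illustration only.
data_source = ACSDataSource(survey_year="2018", horizon="1-Year", survey="person")
acs_data = data_source.get_data(states=["CA"], download=True)
X, y, group = ACSIncome.df_to_numpy(acs_data)

# 90% of the records for training, the remaining 10% for empirical
# evaluation of the bounds (random split; seed fixed for repeatability).
X_train, X_test, y_train, y_test, g_train, g_test = train_test_split(
    X, y, group, test_size=0.1, random_state=0
)
```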
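The Experiment Setup row describes learning ℓ2-regularized logistic regression and privatizing it by output perturbation, which is possible because the regularized objective is strongly convex. Below is a minimal sketch, assuming features are scaled to unit ℓ2 norm and using a standard Gaussian-mechanism calibration for (ε, δ)-differential privacy; the exact noise constants and privacy accounting used in the paper may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_private_logreg(X, y, lam=1.0, eps=1.0, delta=1e-5, rng=None):
    """Sketch of output perturbation for l2-regularized logistic regression.

    Assumes ||x||_2 <= 1 for every row of X, so the per-example logistic
    loss is 1-Lipschitz in the weights; lam is the regularization
    (strong-convexity) parameter. Noise calibration follows the standard
    Gaussian mechanism, not necessarily the paper's exact constants.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = X.shape[0]
    # scikit-learn minimizes 0.5*||w||^2 + C * sum_i loss_i, so C = 1/(n*lam)
    # matches the objective (1/n) * sum_i loss_i + (lam/2) * ||w||^2.
    clf = LogisticRegression(C=1.0 / (n * lam), fit_intercept=False, max_iter=1000)
    clf.fit(X, y)
    w = clf.coef_.ravel()
    # l2-sensitivity of the exact minimizer for a lam-strongly-convex
    # objective with 1-Lipschitz per-example losses: 2 / (n * lam).
    sensitivity = 2.0 / (n * lam)
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    w_private = w + rng.normal(0.0, sigma, size=w.shape)
    return w, w_private
```

Sweeping `eps` over values logarithmically spaced between 10⁻³ and 10 (e.g., `np.logspace(-3, 1, 20)`) reproduces the kind of privacy-budget grid described in the setup.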