Differential Privacy has Bounded Impact on Fairness in Classification

Authors: Paul Mangold, Michaël Perrot, Aurélien Bellet, Marc Tommasi

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we numerically illustrate the upper bounds from Section 4.2. We use the CelebA (Liu et al., 2015) and folktables (Ding et al., 2021) datasets... In Table 1, we compute the value of Theorem 4.4's bounds. We learn a non-private ℓ2-regularized logistic regression model, and use it to compute the bounds..."
Researcher Affiliation | Academia | "Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France."
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code is available at https://github.com/pmangold/fairness-privacy."
Open Datasets | Yes | "We use the CelebA (Liu et al., 2015) and folktables (Ding et al., 2021) datasets... For each dataset, we use 90% of the records for training... The CelebA dataset... can be downloaded at http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html, and the folktables dataset... can be downloaded using a Python package available here https://github.com/zykls/folktables." (see the data-loading sketch after the table)
Dataset Splits | No | "For each dataset, we use 90% of the records for training, and the remaining 10% for empirical evaluation of the bounds."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | "On each dataset, for each value of n, we train an ℓ2-regularized logistic regression model using scikit-learn (Pedregosa et al., 2011)."
Experiment Setup | Yes | "We train ℓ2-regularized logistic regression models, ensuring that the underlying optimization problem is 1-strongly-convex. This allows learning private models by output perturbation, for which the bound from Theorem 4.4 holds. ... For each value of n and ϵ, we plot Theorem 4.4's theoretical guarantees... For the plots with different numbers of training records, we train 20 non-private models with a number of records logarithmically spaced between 10 and the number of records in the complete training set... For the plots with different privacy budgets, we use 20 values logarithmically spaced between 10^-3 and 10 for both datasets." (see the output-perturbation sketch after the table)
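
To make the data pipeline concrete, here is a minimal sketch of loading a folktables task and reproducing the 90/10 split quoted above. The specific task (ACSIncome), survey year, and state are illustrative assumptions; the quoted excerpts do not specify them.

```python
# Minimal sketch: load a folktables task and make a 90/10 train/test split.
# The task (ACSIncome), survey year, and state are illustrative assumptions,
# not necessarily the configuration used in the paper.
from folktables import ACSDataSource, ACSIncome
from sklearn.model_selection import train_test_split

# Download the 2018 1-year ACS person survey for one state.
data_source = ACSDataSource(survey_year="2018", horizon="1-Year", survey="person")
acs_data = data_source.get_data(states=["AL"], download=True)

# Convert the raw survey records into (features, labels, group) arrays.
X, y, group = ACSIncome.df_to_numpy(acs_data)

# 90% of records for training, 10% for empirical evaluation of the bounds.
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.9)
```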
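The experiment-setup row describes output perturbation on a 1-strongly-convex ℓ2-regularized logistic regression objective. The sketch below shows a generic Gaussian-mechanism instance of output perturbation, not the authors' code: the sensitivity 2/n assumes a 1-Lipschitz per-sample loss (e.g., features scaled to unit norm) and 1-strong convexity, following the standard calibration of Chaudhuri et al. (2011), and `private_logreg` is a hypothetical helper name. The paper's exact mechanism and constants may differ.

```python
# Minimal sketch of output perturbation for l2-regularized logistic
# regression, under illustrative assumptions: 1-Lipschitz per-sample loss
# and a 1-strongly-convex (average-loss) objective, giving l2-sensitivity
# 2 / n for the non-private minimizer.
import numpy as np
from sklearn.linear_model import LogisticRegression

def private_logreg(X_train, y_train, epsilon, delta=1e-5):
    n = X_train.shape[0]
    # scikit-learn's C is the inverse regularization strength; C = 1/n makes
    # the average-loss objective 1-strongly-convex (illustrative choice).
    # fit_intercept=False keeps the sensitivity analysis to the weights only.
    clf = LogisticRegression(C=1.0 / n, fit_intercept=False)
    clf.fit(X_train, y_train)
    theta = clf.coef_.ravel()

    # Gaussian noise calibrated for (epsilon, delta)-DP; this standard
    # calibration is valid for epsilon <= 1.
    sensitivity = 2.0 / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return theta + np.random.normal(scale=sigma, size=theta.shape)
```

The sweep over privacy budgets quoted in the table (20 values logarithmically spaced between 10^-3 and 10) can then be written as, e.g., `np.logspace(-3, 1, 20)`.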