Disparate Impact in Differential Privacy from Gradient Misalignment
Authors: Maria S. Esipova, Atiyeh Ashari Ghomi, Yaqiao Luo, Jesse C. Cresswell
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments we provide evidence that gradient misalignment is the most significant cause of unfairness, and demonstrate that global scaling can effectively reduce unfairness by aligning gradients. Our code for reproducing the experiments is provided as supplementary material. |
| Researcher Affiliation | Industry | Maria S. Esipova, Atiyeh Ashari Ghomi, Yaqiao Luo & Jesse C. Cresswell Layer 6 AI {maria, atiyeh, emily, jesse}@layer6.ai |
| Pseudocode | Yes | Algorithm 1: DPSGD (a minimal sketch of one DPSGD step appears after the table) |
| Open Source Code | Yes | Our code for reproducing the experiments is provided as supplementary material. |
| Open Datasets | Yes | We use an artificially unbalanced MNIST training dataset... We also use two census datasets popular in the ML fairness literature, Adult and Dutch (van der Laan, 2000)... Finally, we use the CelebA dataset (Liu et al., 2015)... The Adult dataset is available at archive.ics.uci.edu/ml/datasets/Adult. The Dutch dataset is also available through the work of Le Quy et al. (2022) at raw.githubusercontent.com/tailequy/fairness_dataset/main/Dutch_census/dutch_census_2001.arff. We accessed CelebA via kaggle.com/datasets/jessicali9530/celeba-dataset. |
| Dataset Splits | Yes | We use an 80/20 train/test split for both tabular datasets. For CelebA, the training/validation/test split is provided with the dataset and is roughly an 80/10/10 ratio. |
| Hardware Specification | Yes | Experiments were conducted on single TITAN V GPU machines. |
| Software Dependencies | Yes | Instead, we compute the terms involving Hessians like H_aℓ ḡ_B through Hessian-vector products (HVPs) using the functorch library with PyTorch 1.11 (an HVP sketch follows the table). |
| Experiment Setup | Yes | We set σ = 1, C₀ = 0.5 for Adult, σ = 1, C₀ = 0.1 for Dutch, while for MNIST and CelebA, we set σ = 0.8 and C₀ = 1... For non-global methods, the learning rate is η_t = 0.01 for all iterations t and all datasets except Dutch, which has η_t = 0.8... All methods for all datasets use training and test batches of size 256. (These values are collected into a runnable configuration sketch after the table.) |
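
The Pseudocode row quotes Algorithm 1, DPSGD. For reference, here is a minimal sketch of one DP-SGD step (Abadi et al., 2016) in PyTorch. The clipping norm C₀ = 0.5, noise multiplier σ = 1, and learning rate 0.01 follow the Adult settings quoted in the Experiment Setup row; the model and loss function are hypothetical stand-ins, not the authors' released code.

```python
import torch

def dpsgd_step(model, loss_fn, xb, yb, lr=0.01, C0=0.5, sigma=1.0):
    """One DP-SGD update: clip each per-example gradient to norm C0,
    sum, add N(0, (sigma * C0)^2) noise, average, then step."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xb, yb):  # explicit per-example loop for clarity
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(C0 / (norm + 1e-12), max=1.0)  # clip factor
        for s, g in zip(summed, grads):
            s.add_(g, alpha=float(scale))
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.randn_like(s) * sigma * C0
            p.add_(-(lr / len(xb)) * (s + noise))
```

The per-example loop is written out for clarity; efficient implementations vectorize per-example gradients, e.g. with functorch's vmap and grad.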
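The Software Dependencies row says Hessian terms are computed through Hessian-vector products with functorch on PyTorch 1.11. A standard functorch recipe for an HVP without materializing the full Hessian is forward-over-reverse differentiation, jvp(grad(f)). The quadratic-plus-product loss f below is a hypothetical stand-in for the per-group loss in the paper.

```python
import torch
from functorch import grad, jvp  # functorch ships alongside PyTorch 1.11

def f(theta):
    # hypothetical stand-in loss; any scalar-valued function works
    return 0.5 * (theta ** 2).sum() + theta.prod()

theta = torch.randn(3)
v = torch.randn(3)

# Hessian-vector product H(theta) @ v, computed as the directional
# derivative of the gradient: jvp of grad(f) at theta along v.
_, hv = jvp(grad(f), (theta,), (v,))

# cross-check against PyTorch's built-in reverse-over-reverse HVP
_, hv_ref = torch.autograd.functional.hvp(f, theta, v)
assert torch.allclose(hv, hv_ref)
```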
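Finally, the Experiment Setup values map directly onto a standard DP-SGD training harness. The sketch below uses Opacus purely as an illustrative stand-in (the quoted text does not state which DP-SGD implementation the authors use), and the data shapes are hypothetical; the batch size 256, η_t = 0.01, σ = 1, and C₀ = 0.5 are the Adult values quoted above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine  # assumption: Opacus as a stand-in DP-SGD library

# toy tabular data standing in for Adult (the 104-feature width is hypothetical)
X, y = torch.randn(1024, 104), torch.randint(0, 2, (1024,))
train_loader = DataLoader(TensorDataset(X, y), batch_size=256)  # batch size from the paper

model = torch.nn.Sequential(torch.nn.Linear(104, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # η_t for Adult

# Adult settings quoted above: σ = 1, C₀ = 0.5
model, optimizer, train_loader = PrivacyEngine().make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.0,  # σ
    max_grad_norm=0.5,     # C₀
)
```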