Fairness for Image Generation with Uncertain Sensitive Attributes
Authors: Ajil Jalal, Sushrut Karmalkar, Jessica Hoffmann, Alex Dimakis, Eric Price
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments validate our theoretical results and achieve fair image reconstruction using state-of-the-art generative models. We implement Posterior Sampling via Langevin dynamics, study its empirical performance and compare it to PULSE with respect to our defined metrics. We do this on the MNIST (LeCun, 1998), Flickr Faces-HQ (Karras et al., 2019) and AFHQ cat & dog (Choi et al., 2020b) datasets. |
| Researcher Affiliation | Academia | 1Department of Electrical and Computer Engineering, The University of Texas at Austin 2Department of Computer Science, The University of Texas at Austin. |
| Pseudocode | No | The paper describes algorithms (Posterior Sampling, Langevin dynamics) textually and with equations, but does not provide a formally structured pseudocode block or algorithm box. |
| Open Source Code | Yes | Our code and models are available at: https://github.com/ajiljalal/code-cs-fairness. |
| Open Datasets | Yes | We do this on the MNIST (LeCun, 1998), Flickr Faces-HQ (Karras et al., 2019) and AFHQ cat & dog (Choi et al., 2020b) datasets. |
| Dataset Splits | Yes | We trained StyleGAN2 (Karras et al., 2020a) on the AFHQ cat & dog (Choi et al., 2020b) training set. ... for the 20% cat generator, we use 125 images of cats and all 500 images of dogs from the AFHQ dataset. Similarly, for the 80% cat generator, we use 500 images of cats and 125 images of dogs in the test set. ... We use a generator trained on 50% cats and 50% dogs, and study whether Posterior Sampling and PULSE satisfy RDP, SPE, and PR in practice. In this case, we use all images of cats and dogs from the AFHQ validation set. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running the experiments beyond acknowledging general computing resources such as the Texas Advanced Computing Center (TACC). |
| Software Dependencies | No | The paper mentions software components such as "StyleGAN2", "NCSNv2", "VAE", "CLIP classifier", and "Resnet108" but does not specify their version numbers. |
| Experiment Setup | Yes | We implement Posterior Sampling via Langevin dynamics, which states that if $x_0 \sim \mathcal{N}(0, c I_n)$ (for $c$ appropriately small), then we can sample from $p(x \mid y)$ by running noisy gradient ascent: $x_{t+1} = x_t + \gamma_t \nabla_{x_t} \log p(x_t \mid y) + \sqrt{2\gamma_t}\, \xi_t$, where $\xi_t \sim \mathcal{N}(0, I_n)$ is an i.i.d. standard Gaussian drawn at each iteration. ... Please see Appendix D for architecture-specific details. ... We trained a VAE (Kingma & Welling, 2013) on MNIST digits... We trained StyleGAN2 (Karras et al., 2020a) on the AFHQ cat & dog (Choi et al., 2020b) training set. In order to study the effect of population bias on PULSE and Posterior Sampling, we trained three models on datasets with varying bias: (1) 20% cats and 80% dogs, (2) 80% cats and 20% dogs, and (3) 50% cats and 50% dogs. |
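The Langevin update quoted above can be illustrated on a toy problem. The sketch below is ours, not the paper's score-network implementation: it runs unadjusted Langevin dynamics on a small linear-Gaussian inverse problem (hypothetical matrix `A`, observation `y`), where the posterior $p(x \mid y)$ is Gaussian with a known mean, so the sampler's output can be sanity-checked. The function and variable names are illustrative assumptions.

```python
import numpy as np

def langevin_posterior_sample(grad_log_p, x0, step_sizes, rng):
    """Noisy gradient ascent: x_{t+1} = x_t + g_t * grad_log_p(x_t) + sqrt(2 g_t) * xi_t."""
    x = x0
    for g in step_sizes:
        xi = rng.standard_normal(x.shape)  # i.i.d. standard Gaussian each iteration
        x = x + g * grad_log_p(x) + np.sqrt(2 * g) * xi
    return x

# Toy linear-Gaussian inverse problem: y = A x + noise, standard Gaussian prior on x.
rng = np.random.default_rng(0)
n, m = 4, 3
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
sigma = 0.1
y = A @ x_true + sigma * rng.standard_normal(m)

def grad_log_posterior(x):
    # log p(x|y) = -||y - A x||^2 / (2 sigma^2) - ||x||^2 / 2 + const
    return A.T @ (y - A @ x) / sigma**2 - x

# Closed-form posterior mean for comparison (Gaussian posterior).
cov = np.linalg.inv(A.T @ A / sigma**2 + np.eye(n))
true_mean = cov @ (A.T @ y / sigma**2)

# Run many chains from x0 ~ N(0, c I_n) with c small; average the final iterates.
steps = [1e-3] * 2000
samples = np.stack([
    langevin_posterior_sample(grad_log_posterior,
                              np.sqrt(0.1) * rng.standard_normal(n),
                              steps, rng)
    for _ in range(200)
])
est_mean = samples.mean(axis=0)
```

For a Gaussian target the unadjusted Langevin chain has the correct stationary mean (only its variance is biased by the fixed step size), so `est_mean` should land near `true_mean` up to Monte Carlo error; the paper's setting replaces the analytic gradient with a learned score model.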