Fairness for Image Generation with Uncertain Sensitive Attributes

Authors: Ajil Jalal, Sushrut Karmalkar, Jessica Hoffmann, Alex Dimakis, Eric Price

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments validate our theoretical results and achieve fair image reconstruction using state-of-the-art generative models. We implement Posterior Sampling via Langevin dynamics, study its empirical performance, and compare it to PULSE with respect to our defined metrics. We do this on the MNIST (LeCun, 1998), Flickr-Faces-HQ (Karras et al., 2019) and AFHQ cat & dog (Choi et al., 2020b) datasets.
Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, The University of Texas at Austin; (2) Department of Computer Science, The University of Texas at Austin.
Pseudocode | No | The paper describes its algorithms (Posterior Sampling, Langevin dynamics) textually and with equations, but does not provide a formally structured pseudocode block or algorithm box.
Open Source Code | Yes | Our code and models are available at: https://github.com/ajiljalal/code-cs-fairness
Open Datasets | Yes | We do this on the MNIST (LeCun, 1998), Flickr-Faces-HQ (Karras et al., 2019) and AFHQ cat & dog (Choi et al., 2020b) datasets.
Dataset Splits | Yes | We trained StyleGAN2 (Karras et al., 2020a) on the AFHQ cat & dog (Choi et al., 2020b) training set. ... For the 20% cat generator, we use 125 images of cats and all 500 images of dogs from the AFHQ dataset. Similarly, for the 80% cat generator, we use 500 images of cats and 125 images of dogs in the test set. ... We use a generator trained on 50% cats and 50% dogs, and study whether Posterior Sampling and PULSE satisfy RDP, SPE, and PR in practice. In this case, we use all images of cats and dogs from the AFHQ validation set. (A subsampling sketch for such biased splits appears after the table.)
Hardware Specification | No | The paper does not explicitly describe the hardware used to run the experiments beyond acknowledging general computing resources such as TACC (the Texas Advanced Computing Center).
Software Dependencies | No | The paper mentions software components such as "StyleGAN2", "NCSNv2", "VAE", "CLIP classifier", and "Resnet108", but does not specify their version numbers.
Experiment Setup | Yes | We implement Posterior Sampling via Langevin dynamics, which states that if $x_0 \sim \mathcal{N}(0, c I_n)$ (for $c$ appropriately small), then we can sample from $p(x \mid y)$ by running noisy gradient ascent: $x_{t+1} \leftarrow x_t + \gamma_t \nabla_{x_t} \log p(x_t \mid y) + \sqrt{2\gamma_t}\,\xi_t$, where $\xi_t \sim \mathcal{N}(0, I_n)$ is an i.i.d. standard Gaussian drawn at each iteration. ... Please see Appendix D for architecture-specific details. ... We trained a VAE (Kingma & Welling, 2013) on MNIST digits... We trained StyleGAN2 (Karras et al., 2020a) on the AFHQ cat & dog (Choi et al., 2020b) training set. In order to study the effect of population bias on PULSE and Posterior Sampling, we trained three models on datasets with varying bias: (1) 20% cats and 80% dogs, (2) 80% cats and 20% dogs, and (3) 50% cats and 50% dogs. (A minimal Langevin sketch follows the table.)
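
The Langevin update quoted under Experiment Setup maps directly to a few lines of code. Below is a minimal, self-contained sketch of posterior sampling via Langevin dynamics; `grad_log_posterior`, the step count, and the fixed step size `gamma` are illustrative placeholders, not the paper's settings (in the paper, the score of $p(x \mid y)$ comes from a pretrained generative model such as NCSNv2 combined with the measurement likelihood, and the step size follows a schedule $\gamma_t$).

```python
import numpy as np

def langevin_posterior_sample(grad_log_posterior, n, steps=1000, c=1e-2,
                              gamma=1e-4, rng=None):
    """Sample from p(x|y) by noisy gradient ascent (Langevin dynamics).

    `grad_log_posterior` is a hypothetical callable returning
    grad_x log p(x|y); all hyperparameters here are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = rng.normal(0.0, np.sqrt(c), size=n)   # x_0 ~ N(0, c I_n), c small
    for _ in range(steps):
        xi = rng.standard_normal(n)           # xi_t ~ N(0, I_n), fresh each step
        # x_{t+1} = x_t + gamma_t * grad_x log p(x_t|y) + sqrt(2 gamma_t) * xi_t
        x = x + gamma * grad_log_posterior(x) + np.sqrt(2.0 * gamma) * xi
    return x

# Toy usage: for a Gaussian posterior N(mu, I), grad_x log p(x|y) = -(x - mu),
# so the chain should equilibrate around mu with stationary law N(mu, I).
mu = np.ones(4)
sample = langevin_posterior_sample(lambda x: -(x - mu), n=4,
                                   steps=5000, gamma=1e-3)
```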
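
For the biased splits quoted under Dataset Splits, one plausible way to reproduce the 125/500 cat-dog counts is plain subsampling of the AFHQ file lists. The directory layout, file pattern, and helper name below are assumptions for illustration, not taken from the paper's code.

```python
import random
from pathlib import Path

def biased_file_list(cat_dir, dog_dir, n_cats, n_dogs, seed=0):
    """Subsample per-class file lists to a target cat/dog ratio (illustrative)."""
    rng = random.Random(seed)
    cats = sorted(Path(cat_dir).glob("*.jpg"))  # assumed layout: one directory per class
    dogs = sorted(Path(dog_dir).glob("*.jpg"))
    files = rng.sample(cats, n_cats) + rng.sample(dogs, n_dogs)
    rng.shuffle(files)
    return files

# 20% cats / 80% dogs, mirroring the quoted 125 cats + 500 dogs:
split = biased_file_list("afhq/test/cat", "afhq/test/dog",
                         n_cats=125, n_dogs=500)
```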