Finetuning Text-to-Image Diffusion Models for Fairness

Authors: Xudong Shen, Chao Du, Tianyu Pang, Min Lin, Yongkang Wong, Mohan Kankanhalli

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, our method markedly reduces gender, racial, and their intersectional biases for occupational prompts.
Researcher Affiliation | Collaboration | ¹ISEP Programme, NUS Graduate School, National University of Singapore; ²Sea AI Lab, Singapore; ³School of Computing, National University of Singapore. xudong.shen@u.nus.edu; {duchao, tianyupang, linmin}@sea.com; yongkang.wong@nus.edu.sg; mohan@comp.nus.edu.sg
Pseudocode | Yes | We show implementation of adjusted DFT in Algorithm A.1. (A simplified direct-finetuning sketch follows the table.)
Open Source Code | Yes | We share code and various fair diffusion model adaptors at https://sail-sg.github.io/finetune-fair-diffusion/.
Open Datasets | Yes | In our implementation, we use the CelebA (Liu et al., 2015) and the FairFace dataset (Karkkainen & Joo, 2021) as external faces. (A hypothetical loading sketch follows the table.)
Dataset Splits | Yes | We have another 10 occupations used for validation: ['housekeeping cleaner', 'freelance writer', 'lieutenant', 'fine artist', 'administrative law judge', 'librarian', 'sale', 'anesthesiologist', 'secondary school teacher', 'dancer']. (A prompt-construction sketch using this list follows the table.)
Hardware Specification | Yes | The finetuning takes around 48 hours on 8 NVIDIA A100 GPUs.
Software Dependencies | No | The paper mentions using 'DPM-Solver++ (Lu et al., 2022) as the diffusion scheduler' but does not specify its version number or any other software dependencies with their respective versions. (A hedged scheduler-configuration sketch follows the table.)
Experiment Setup | Yes | We set λ_face = 1, λ_img,1 = 8, λ_img,2 = 0.2·λ_img,1, and λ_img,3 = 0.2·λ_img,2. We use batch size N = 24 and set the confidence threshold for the distributional alignment loss C = 0.8. We train for 10k iterations using the AdamW optimizer with learning rate 5e-5. (A config sketch collecting these values follows the table.)
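
The adjusted DFT of Algorithm A.1 is not reproduced here. As orientation only, the sketch below shows plain direct finetuning: run a differentiable sampling loop and backpropagate a loss computed on the generated sample into the finetuned parameters. The denoiser call, the update rule, and the loss function are placeholders, not the paper's components, and the paper's gradient adjustment is omitted.

```python
# Simplified sketch of *plain* direct finetuning (DFT), not the paper's
# adjusted variant (Algorithm A.1). All components are placeholders.
import torch

def dft_step(denoiser, loss_fn, optimizer, timesteps, sample_shape, device="cuda"):
    optimizer.zero_grad()
    x = torch.randn(sample_shape, device=device)  # start from Gaussian noise
    for t in timesteps:                           # keep the graph through every step
        eps = denoiser(x, t)                      # predicted noise at step t
        x = x - 0.1 * eps                         # placeholder update; a real sampler
                                                  # (e.g. DPM-Solver++) is used in practice
    loss = loss_fn(x)                             # e.g. a loss on the generated image
    loss.backward()                               # gradient flows through the whole chain
    optimizer.step()
    return float(loss.detach())
```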
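
For the external face images (CelebA and FairFace), one rough illustration of how they might be loaded is given below. The directory paths, image size, and the use of ImageFolder for FairFace are assumptions, not the authors' pipeline; CelebA happens to ship with torchvision.

```python
# Hypothetical data-loading sketch for the external face images.
# Paths, image size, and FairFace layout are assumptions.
from torchvision import datasets, transforms

face_transform = transforms.Compose([
    transforms.Resize((224, 224)),  # assumed size for a face feature extractor
    transforms.ToTensor(),
])

# CelebA is available through torchvision and can be downloaded automatically.
celeba_faces = datasets.CelebA(
    root="data/celeba", split="train", transform=face_transform, download=True
)

# FairFace is not in torchvision; here it is assumed to be pre-downloaded and
# arranged into subfolders so that ImageFolder can read it.
fairface_faces = datasets.ImageFolder(root="data/fairface", transform=face_transform)
```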
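
The 10 validation occupations quoted in the table can be turned into evaluation prompts with a few lines; the template string below is a hypothetical placeholder, not the paper's exact prompt wording.

```python
# Build validation prompts from the 10 held-out occupations listed above.
# The prompt template is an assumed placeholder.
VAL_OCCUPATIONS = [
    "housekeeping cleaner", "freelance writer", "lieutenant", "fine artist",
    "administrative law judge", "librarian", "sale", "anesthesiologist",
    "secondary school teacher", "dancer",
]

def make_validation_prompts(template="a photo of the face of a {occupation}"):
    return [template.format(occupation=o) for o in VAL_OCCUPATIONS]

if __name__ == "__main__":
    for prompt in make_validation_prompts():
        print(prompt)
```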
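
Since only the scheduler is named and no versions are given, the snippet below shows one common way to attach a DPM-Solver++ scheduler to a Stable Diffusion pipeline with Hugging Face diffusers. The model ID, step count, and prompt are assumptions, and the diffusers version used by the authors is unknown.

```python
# Hedged example: configuring DPM-Solver++ as the sampler in diffusers.
# The model ID and num_inference_steps=25 are assumptions.
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Swap in DPM-Solver++ while reusing the pipeline's existing scheduler config.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, algorithm_type="dpmsolver++"
)

image = pipe("a photo of the face of a librarian", num_inference_steps=25).images[0]
image.save("librarian.png")
```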
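
Finally, the reported training hyperparameters can be collected into a small config. The sketch below only restates the quoted values and builds an AdamW optimizer; `lora_params` is a hypothetical placeholder for the finetuned parameters, and all other AdamW defaults are assumptions.

```python
# Restating the reported hyperparameters as a config dict.
# `lora_params` is a hypothetical placeholder for the finetuned parameters.
import torch

config = {
    "lambda_face": 1.0,
    "lambda_img_1": 8.0,
    "lambda_img_2": 0.2 * 8.0,          # = 0.2 * lambda_img_1
    "lambda_img_3": 0.2 * (0.2 * 8.0),  # = 0.2 * lambda_img_2
    "batch_size": 24,                   # N
    "confidence_threshold": 0.8,        # C, for the distributional alignment loss
    "iterations": 10_000,
    "learning_rate": 5e-5,
}

def build_optimizer(lora_params, cfg=config):
    """AdamW with the reported learning rate; other defaults are assumptions."""
    return torch.optim.AdamW(lora_params, lr=cfg["learning_rate"])
```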