Finetuning Text-to-Image Diffusion Models for Fairness
Authors: Xudong Shen, Chao Du, Tianyu Pang, Min Lin, Yongkang Wong, Mohan Kankanhalli
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our method markedly reduces gender, racial, and their intersectional biases for occupational prompts. |
| Researcher Affiliation | Collaboration | ¹ISEP Programme, NUS Graduate School, National University of Singapore; ²Sea AI Lab, Singapore; ³School of Computing, National University of Singapore. xudong.shen@u.nus.edu; {duchao, tianyupang, linmin}@sea.com; yongkang.wong@nus.edu.sg; mohan@comp.nus.edu.sg |
| Pseudocode | Yes | We show implementation of adjusted DFT in Algorithm A.1. |
| Open Source Code | Yes | We share code and various fair diffusion model adaptors at https://sail-sg.github.io/finetune-fair-diffusion/. |
| Open Datasets | Yes | In our implementation, we use the CelebA (Liu et al., 2015) and the FairFace dataset (Karkkainen & Joo, 2021) as external faces. |
| Dataset Splits | Yes | We have another 10 occupations used for validation: ['housekeeping cleaner', 'freelance writer', 'lieutenant', 'fine artist', 'administrative law judge', 'librarian', 'sale', 'anesthesiologist', 'secondary school teacher', 'dancer']. |
| Hardware Specification | Yes | The finetuning takes around 48 hours on 8 NVIDIA A100 GPUs. |
| Software Dependencies | No | The paper mentions using 'DPM-Solver++ (Lu et al., 2022) as the diffusion scheduler' but does not specify its version number or any other software dependencies with their respective versions. |
| Experiment Setup | Yes | We set λ_face = 1, λ_img,1 = 8, λ_img,2 = 0.2·λ_img,1, and λ_img,3 = 0.2·λ_img,2. We use batch size N = 24 and set the confidence threshold for the distributional alignment loss C = 0.8. We train for 10k iterations using the AdamW optimizer with learning rate 5e-5. |
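
For readers checking the reported setup, the sketch below shows one way the stated hyperparameters could be wired together in PyTorch. Only the numeric values (loss weights, batch size, threshold, iteration count, learning rate) come from the paper; the loss terms, their names, and the `adaptor_params` placeholder are hypothetical stand-ins, not the authors' implementation.

```python
import torch

# Hyperparameter values as reported in the paper's experiment setup.
LAMBDA_FACE  = 1.0
LAMBDA_IMG_1 = 8.0
LAMBDA_IMG_2 = 0.2 * LAMBDA_IMG_1  # = 1.6
LAMBDA_IMG_3 = 0.2 * LAMBDA_IMG_2  # = 0.32
BATCH_SIZE   = 24                  # N
CONF_THRESH  = 0.8                 # C, confidence threshold for distributional alignment
NUM_ITERS    = 10_000
LR           = 5e-5

def total_loss(align_loss, face_loss, img_loss_1, img_loss_2, img_loss_3):
    """Weighted sum of the loss terms using the reported weights.

    The individual losses (distributional alignment, face loss, image
    regularizers) are assumed inputs; this sketch does not reimplement them.
    """
    return (align_loss
            + LAMBDA_FACE  * face_loss
            + LAMBDA_IMG_1 * img_loss_1
            + LAMBDA_IMG_2 * img_loss_2
            + LAMBDA_IMG_3 * img_loss_3)

# Placeholder trainable parameters standing in for the finetuned adaptor weights.
adaptor_params = [torch.nn.Parameter(torch.randn(16, 16) * 0.01)]
optimizer = torch.optim.AdamW(adaptor_params, lr=LR)
```

The chained loss weights (each image term scaled to 0.2 of the previous one) are written out explicitly so the relationship stated in the paper is preserved rather than hard-coding 1.6 and 0.32.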