Fair Generative Models via Transfer Learning

Authors: Christopher T.H. Teo, Milad Abdollahzadeh, Ngai-Man Cheung

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that fairTL and fairTL++ achieve state-of-the-art in both quality and fairness of generated samples. The code and additional resources can be found at bearwithchris.github.io/fairTL/
Researcher Affiliation | Academia | Singapore University of Technology and Design (SUTD) christopher_teo@mymail.sutd.edu.sg, {milad_abdollahzadeh, ngaiman_cheung}@sutd.edu.sg
Pseudocode | No | The paper describes the methods verbally and with equations (e.g., Eqn. 1, Eqn. 2) but does not include formal pseudocode or algorithm blocks.
Open Source Code | Yes | The code and additional resources can be found at bearwithchris.github.io/fairTL/
Open Datasets | Yes | Dataset. We consider the datasets CelebA (Liu et al. 2015) and UTKFace (Zhang, Song, and Qi 2017) for this experiment.
Dataset Splits | No | The paper discusses the ratios of Dref to Dbias (e.g., 'perc = {0.25, 0.1, 0.05, 0.025}') but does not explicitly state conventional train/validation/test splits in percentages or specific counts for reproducibility.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for the experiments.
Software Dependencies | No | The paper mentions using BigGAN and StyleGAN2 but does not provide specific version numbers for software dependencies or libraries (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | Eqn. 2 presents the loss function, where we utilize λ ∈ [0, 1] as a hyper-parameter to control the balance between enforcing fairness and quality. In our experiments, we found that although both discriminators play an essential part in improving the performance of the GAN, more emphasis should be placed on Dt. In particular, since Ds is frozen, making λ too small results in instability during training. Conversely, making λ too big limits the feedback we get on the sample's quality. Empirically, we found λ = 0.6 to be ideal.
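The λ-weighted balance between the frozen source discriminator Ds and the adapted target discriminator Dt quoted above can be illustrated with a short sketch. This is a minimal, hypothetical PyTorch rendering of the generator-side objective implied by Eqn. 2, not the authors' released code; the names G, D_s, D_t, the hinge-style loss terms, and the weighting convention (λ on Dt, 1 − λ on Ds) are assumptions made for illustration.

```python
# Hypothetical sketch of a lambda-weighted dual-discriminator generator loss,
# assuming lam weights the trainable target discriminator D_t and (1 - lam)
# weights the frozen source discriminator D_s, as the quoted passage suggests.
import torch

def generator_loss(G, D_s, D_t, z, lam=0.6):
    """Combine adversarial feedback from D_t with quality feedback from the
    frozen, pre-trained D_s; lam is assumed to lie in [0, 1] (0.6 per the paper)."""
    fake = G(z)
    loss_t = -D_t(fake).mean()  # non-saturating term from the adapted discriminator D_t
    loss_s = -D_s(fake).mean()  # quality feedback from the frozen source discriminator D_s
    return lam * loss_t + (1.0 - lam) * loss_s
```

With this weighting, lam = 0.6 places slightly more emphasis on D_t while still keeping a substantial quality signal from D_s, matching the trade-off described in the quoted setup.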