Limitations of Face Image Generation
Authors: Harrison Rosenberg, Shimaa Ahmed, Guruprasad Ramesh, Kassem Fawaz, Ramya Korlakai Vinayak
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Utilizing a combination of qualitative and quantitative measures, including embedding-based metrics and user studies, we present a framework to audit the characteristics of generated faces conditioned on a set of social attributes. We applied our framework on faces generated through state-of-the-art text-to-image diffusion models. |
| Researcher Affiliation | Academia | Electrical and Computer Engineering Department, University of Wisconsin-Madison; hrosenberg@ece.wisc.edu, {ahmed27,viswanathanr,kfawaz}@wisc.edu, ramya@ece.wisc.edu |
| Pseudocode | No | The paper describes a data generation pipeline with a diagram, but it does not contain explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our survey data and analytics code can be found online at https://github.com/wi-pi/Limitations_of_Face_Generation |
| Open Datasets | Yes | We utilize the Labeled Faces in the Wild (LFW) dataset as a baseline for natural faces verification. LFW is a canonical dataset for face recognition tasks. The LFW dataset contains 13,233 images and a total of 5,749 unique identities. Demographic annotations for images in LFW were obtained from the system introduced by Kumar et al. (2009). |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits for the data used in their experiments, nor does it specify how standard datasets like LFW were split for their evaluation. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU models) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like CLIP, DINO-v2, Facenet, and GPT-3.5, but it does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For Realism, we experimented with a set of prompts... A photo of the face of {identity}. We vary the TTI generator seed to generate multiple images per identity and prompt. We also add a set of negative prompts... For SDv2.1... A photo of the face of ({identity}:2.0). (realistic:2.0). (Face shot only:2.0). ... All the synthesized images are of 512×512 resolution. For SDv2.1, to ensure better quality, we generate the images at 768×768 and then downsample them to 512×512. |
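
The prompt and seed setup quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name `build_jobs`, the model keys, the example identity string, and the job dictionary layout are all hypothetical, and pairing the weighted template with SDv2.1 is an assumption, since the quoted excerpt elides the text between them. No image generator is called; the sketch only enumerates the prompt and seed combinations and records the render and final resolutions.

```python
from itertools import product

# Prompt templates as quoted in the Experiment Setup row.
BASE_TEMPLATE = "A photo of the face of {identity}."
# Assumption: the weighted variant is the one used with SDv2.1.
SDV21_TEMPLATE = (
    "A photo of the face of ({identity}:2.0). "
    "(realistic:2.0). (Face shot only:2.0)."
)

FINAL_SIZE = 512  # all synthesized images end up at 512x512


def build_jobs(identities, seeds, model="sdv2.1"):
    """Return one generation job per (identity, seed) pair.

    `model` is a hypothetical key: "sdv2.1" selects the weighted template
    and a 768x768 render that is later downsampled to 512x512; any other
    value selects the plain template rendered directly at 512x512.
    """
    is_sdv21 = model == "sdv2.1"
    template = SDV21_TEMPLATE if is_sdv21 else BASE_TEMPLATE
    render_size = 768 if is_sdv21 else FINAL_SIZE
    return [
        {
            "prompt": template.format(identity=identity),
            "seed": seed,  # varying the seed yields multiple images per prompt
            "render_size": render_size,
            "final_size": FINAL_SIZE,
        }
        for identity, seed in product(identities, seeds)
    ]


if __name__ == "__main__":
    # "a doctor" is a placeholder identity, not one taken from the paper.
    for job in build_jobs(["a doctor"], seeds=[0, 1]):
        print(job)
```

Keeping the render size and the final size as separate fields makes the downsampling step for SDv2.1 explicit, so a later resizing stage can be applied only where the two values differ.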