SEGA: Instructing Text-to-Image Models using Semantic Guidance

Authors: Manuel Brack, Felix Friedrich, Dominik Hintersdorf, Lukas Struppek, Patrick Schramowski, Kristian Kersting

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 Experimental Evaluation
Researcher Affiliation | Collaboration | Manuel Brack (1,2), Felix Friedrich (2,3), Dominik Hintersdorf (2), Lukas Struppek (2), Patrick Schramowski (1,2,3,4), Kristian Kersting (1,2,3,5). (1) German Research Center for Artificial Intelligence (DFKI); (2) Computer Science Department, TU Darmstadt; (3) Hessian.AI; (4) LAION; (5) Centre for Cognitive Science, TU Darmstadt
Pseudocode | Yes | "Additionally, we also provide the pseudo-code notation of SEGA in Alg 1."
Open Source Code | Yes | Implementation available in diffusers: https://huggingface.co/docs/diffusers/api/pipelines/semantic_stable_diffusion
Open Datasets | Yes | "This setting is inspired by the CelebA dataset [17] and marks a well-established benchmark for semantic changes in image generation." and "Utilizing the facial images from the previous experiment, we calculated FID scores against a reference dataset of FFHQ [11]."
Dataset Splits | No | The paper uses pre-trained models (Stable Diffusion, Paella, DeepFloyd-IF) and conducts user studies on generated images, so it does not specify training, validation, or test splits for its own experiments.
Hardware Specification | No | The paper does not describe the hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The paper mentions building on the "diffusers library" and using "Stable Diffusion v1.5" but gives no version numbers for its software dependencies or programming languages.
Experiment Setup | Yes | "Let us provide some more detailed intuition for each of SEGA's hyper-parameters": "Scale s_e", "Threshold λ", "Warmup δ", "Momentum"; and "All images are generated from the same original image (shown in Fig. 10) obtained by the prompt 'a house at a lake'."
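
The four hyper-parameters named in the Experiment Setup row (scale, threshold, warmup, momentum) can be illustrated with a small sketch of a single guidance step. This is not the paper's Alg. 1 and does not use the diffusers pipeline; it is a simplified pure-Python reading of how such a guidance term is commonly described, operating on flat lists instead of latent tensors. All function names, parameter names, and default values below are hypothetical and chosen only for illustration.

```python
def percentile(values, q):
    """Linearly interpolated q-quantile (0 <= q <= 1) of a non-empty list."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def guidance_step(eps_uncond, eps_edit, momentum, step, *,
                  scale=5.0, threshold=0.9, warmup=10,
                  momentum_scale=0.3, momentum_beta=0.6, reverse=False):
    """One illustrative semantic-guidance step (assumed form, not Alg. 1).

    eps_uncond, eps_edit: noise estimates without and with the edit concept.
    momentum: running momentum term from the previous step (same length).
    Returns (guidance, new_momentum).
    """
    # Direction of the edit: towards the concept, or away from it if reversed.
    sign = -1.0 if reverse else 1.0
    psi = [sign * (e - u) for e, u in zip(eps_edit, eps_uncond)]

    # Threshold: keep only the largest-magnitude entries (upper tail),
    # and apply the scale to the entries that survive.
    cutoff = percentile([abs(p) for p in psi], threshold)
    masked = [scale * p if abs(p) >= cutoff else 0.0 for p in psi]

    # Momentum: add the accumulated term, then update the running average.
    guidance = [m + momentum_scale * v for m, v in zip(masked, momentum)]
    new_momentum = [momentum_beta * v + (1.0 - momentum_beta) * g
                    for v, g in zip(momentum, guidance)]

    # Warmup: no guidance is applied during the first `warmup` steps,
    # although momentum is still accumulated in this sketch.
    if step < warmup:
        guidance = [0.0] * len(psi)
    return guidance, new_momentum
```

In this sketch a caller would start with a zero momentum list and feed the returned momentum back in at every denoising step; a higher threshold restricts the edit to fewer entries, and a longer warmup delays when the edit begins to affect the output.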