Universal Guidance for Diffusion Models

Authors: Arpit Bansal, Hong-Min Chu, Avi Schwarzschild, Soumyadip Sengupta, Micah Goldblum, Jonas Geiping, Tom Goldstein

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present results testing our proposed universal guidance algorithm against a wide variety of guidance functions. Specifically, we experiment with Stable Diffusion (Rombach et al., 2022), a diffusion model that performs text-conditional generation by accepting a text prompt as additional input, and with a purely unconditional diffusion model trained on ImageNet (Deng et al., 2009), for which we use the pre-trained model provided by OpenAI (Dhariwal & Nichol, 2021). We first present the experiments on Stable Diffusion for different guidance functions in Sec. 4.1, then the results on the ImageNet diffusion model in Sec. 4.2. (See the model-loading sketch after this table.)
Researcher Affiliation | Academia | Arpit Bansal* (University of Maryland, bansal01@umd.edu); Hong-Min Chu* (University of Maryland); Avi Schwarzschild (University of Maryland); Soumyadip Sengupta (University of North Carolina); Micah Goldblum (New York University); Jonas Geiping (University of Maryland); Tom Goldstein (University of Maryland)
Pseudocode | Yes | Algorithm 1: Universal Guidance. (See the PyTorch sketch of Algorithm 1 after this table.)
Open Source Code | Yes | Code is available at github.com/arpitbansal297/Universal-Guided-Diffusion.
Open Datasets | Yes | We experiment with Stable Diffusion (Rombach et al., 2022)... and with a purely unconditional diffusion model trained on ImageNet (Deng et al., 2009).
Dataset Splits | No | The paper uses pre-trained models (Stable Diffusion, the ImageNet diffusion model) and evaluates the proposed guidance algorithm on generated images. It does not describe explicit training, validation, or test dataset splits for its experiments.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used to run its experiments. It mentions a 'computational budget' but gives no explicit specifications.
Software Dependencies | No | The paper mentions software components such as PyTorch, MobileNetV3-Large, and Faster R-CNN, but does not provide version numbers for these dependencies, which are required for reproducibility.
Experiment Setup | Yes | In this section, we present the hyperparameters for the different guidance functions, i.e., face, segmentation, object location, and style guidance. We present the hyperparameters for the experiments on Stable Diffusion (Sec. 4.1) in Tab. 3, where we include the coefficient s0 used to compute s_t = s0 · √(1 − αt) and the number of universal stepwise refinement steps (k). We also provide the hyperparameters for the experiments on ImageNet (Sec. 4.2) in Tab. 4. (See the schedule snippet after this table.)
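
For context on the two backbones quoted in the Research Type row, below is a minimal loading sketch. It assumes the Hugging Face diffusers library and a commonly used Stable Diffusion model id; neither is specified by the paper, whose official code lives at github.com/arpitbansal297/Universal-Guided-Diffusion.

```python
from diffusers import StableDiffusionPipeline

# Text-conditional backbone (Rombach et al., 2022). The model id below is
# an assumption; the paper does not pin a specific checkpoint.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# The purely unconditional ImageNet diffusion model (Dhariwal & Nichol, 2021)
# is distributed as a checkpoint with OpenAI's guided-diffusion code rather
# than through diffusers: https://github.com/openai/guided-diffusion
```

The Pseudocode row refers to Algorithm 1 (Universal Guidance). The following PyTorch sketch paraphrases its forward-guidance branch with self-recurrence under standard DDPM notation (αt denotes the cumulative product of the noise schedule). It is a paraphrase, not the authors' implementation: `denoiser`, `guidance_fn`, `loss_fn`, and `alphas_cumprod` are placeholder names, the DDIM-style update is one common sampler choice, and the s_t schedule follows the reconstruction noted above. Backward guidance, also part of Algorithm 1, is omitted for brevity.

```python
import torch

def guided_step(z_t, t, denoiser, guidance_fn, loss_fn, target,
                alphas_cumprod, s0, k):
    """One reverse-diffusion step with forward universal guidance and k
    self-recurrence rounds (a sketch, not the authors' implementation)."""
    a_t = alphas_cumprod[t]         # cumulative alpha at step t
    a_prev = alphas_cumprod[t - 1]  # cumulative alpha at step t-1
    s_t = s0 * (1.0 - a_t).sqrt()   # guidance strength (assumed schedule)

    for i in range(k):
        z = z_t.detach().requires_grad_(True)
        eps = denoiser(z, t)  # predicted noise, epsilon_theta(z_t, t)
        # Predicted clean image from the current noisy sample
        z0_hat = (z - (1.0 - a_t).sqrt() * eps) / a_t.sqrt()
        # Forward guidance: add the gradient of the guidance loss,
        # evaluated on the predicted clean image, to the noise estimate
        loss = loss_fn(target, guidance_fn(z0_hat))
        grad = torch.autograd.grad(loss, z)[0]
        eps_hat = eps + s_t * grad
        # Deterministic DDIM-style update to z_{t-1}
        z0_hat = (z - (1.0 - a_t).sqrt() * eps_hat) / a_t.sqrt()
        z_prev = a_prev.sqrt() * z0_hat + (1.0 - a_prev).sqrt() * eps_hat
        if i < k - 1:
            # Self-recurrence: re-noise z_{t-1} back to step t and repeat
            noise = torch.randn_like(z_prev)
            z_t = (a_t / a_prev).sqrt() * z_prev \
                + (1.0 - a_t / a_prev).sqrt() * noise
    return z_prev.detach()
```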
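
The re-noising step preserves the marginal noise level of step t, which is what lets each refinement round re-run the guided denoising from the same point in the schedule.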
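
Finally, the Experiment Setup row cites a coefficient s0 that sets a per-timestep guidance strength s_t. The original PDF text is garbled at the formula; s_t = s0 · √(1 − αt) is a plausible reconstruction, used in the snippet below with an assumed linear-beta DDPM schedule and illustrative values of s0 and k (the paper's actual per-task values are in its Tab. 3 and Tab. 4).

```python
import torch

# Illustrative values only; per-task values appear in the paper's tables.
s0, k = 10.0, 5

# Standard linear-beta DDPM schedule (an assumption, not from the paper)
betas = torch.linspace(1e-4, 0.02, 1000)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

# Per-timestep guidance strength: s_t grows as the sample gets noisier
s_t = s0 * (1.0 - alphas_cumprod).sqrt()
```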