Universal Guidance for Diffusion Models
Authors: Arpit Bansal, Hong-Min Chu, Avi Schwarzschild, Soumyadip Sengupta, Micah Goldblum, Jonas Geiping, Tom Goldstein
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present results testing our proposed universal guidance algorithm against a wide variety of guidance functions. Specifically, we experiment with Stable Diffusion (Rombach et al., 2022), a diffusion model that is able to perform text-conditional generation by accepting a text prompt as additional input, and experiment with a purely unconditional diffusion model trained on ImageNet (Deng et al., 2009), where we use a pre-trained model provided by OpenAI (Dhariwal & Nichol, 2021). We first present the experiments on Stable Diffusion for different guidance functions in Sec. 4.1, and present the results on the ImageNet diffusion model in Sec. 4.2. |
| Researcher Affiliation | Academia | Arpit Bansal* (University of Maryland, bansal01@umd.edu); Hong-Min Chu* (University of Maryland); Avi Schwarzschild (University of Maryland); Soumyadip Sengupta (University of North Carolina); Micah Goldblum (New York University); Jonas Geiping (University of Maryland); Tom Goldstein (University of Maryland) |
| Pseudocode | Yes | Algorithm 1 Universal Guidance (a hedged sketch of one sampling step appears after this table) |
| Open Source Code | Yes | Code is available at github.com/arpitbansal297/Universal-Guided-Diffusion. |
| Open Datasets | Yes | we experiment with Stable Diffusion (Rombach et al., 2022)... and experiment with a purely unconditional diffusion model trained on ImageNet (Deng et al., 2009) |
| Dataset Splits | No | The paper uses pre-trained models (Stable Diffusion, ImageNet diffusion model) and evaluates the proposed guidance algorithm on generated images. It does not describe explicit training, validation, or test dataset splits for its experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. It mentions 'computational budget' but no explicit specifications. |
| Software Dependencies | No | The paper mentions software components like PyTorch, MobileNetV3-Large, and Faster-RCNN, but does not provide specific version numbers for these software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | In this section, we present the hyperparameters for the different guidance functions, i.e., face, segmentation, object location, and style guidance. We present the hyperparameters for the Stable Diffusion experiments of Sec. 4.1 in Tab. 3, where we include the coefficient s_0 used to compute the guidance strength s_t = s_0 · √(1 − α_t), and the number of Universal Stepwise Refinement steps (k). We also provide hyperparameters for the ImageNet experiments of Sec. 4.2 in Tab. 4. (An illustrative computation of this schedule follows the table.) |
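
The paper's Algorithm 1 combines forward guidance (steering the model's noise prediction with the gradient of a guidance loss evaluated on the predicted clean image) with k per-step refinement (self-recurrence) passes. Below is a minimal PyTorch sketch of one sampling step under those ideas; `eps_model`, `guidance_loss`, the DDIM-style update, and the argument names are illustrative assumptions, not the authors' released code.

```python
import torch

def universal_guidance_step(z_t, t, t_prev, alphas, eps_model, guidance_loss,
                            s0=1.0, k=1):
    """One sampling step with forward universal guidance and k refinement
    (self-recurrence) passes. Hypothetical sketch: `eps_model(z, t)` predicts
    noise, `guidance_loss(z0_hat)` returns a differentiable scalar loss, and
    `alphas` holds the cumulative alpha schedule."""
    a_t, a_prev = alphas[t], alphas[t_prev]
    # Guidance strength schedule from the paper: s_t = s0 * sqrt(1 - alpha_t).
    s_t = s0 * torch.sqrt(1.0 - a_t)

    for _ in range(k):  # Universal Stepwise Refinement (self-recurrence)
        z = z_t.detach().requires_grad_(True)
        eps = eps_model(z, t)
        # Predicted clean image from the noisy latent (standard DDIM identity).
        z0_hat = (z - torch.sqrt(1.0 - a_t) * eps) / torch.sqrt(a_t)
        # Forward guidance: add the gradient of the guidance loss to eps.
        grad = torch.autograd.grad(guidance_loss(z0_hat), z)[0]
        eps_hat = eps + s_t * grad

        # Deterministic DDIM-style update to z_{t-1} with the guided noise.
        z0_hat = (z - torch.sqrt(1.0 - a_t) * eps_hat) / torch.sqrt(a_t)
        z_prev = torch.sqrt(a_prev) * z0_hat + torch.sqrt(1.0 - a_prev) * eps_hat

        # Self-recurrence: re-noise z_{t-1} back to step t and repeat.
        noise = torch.randn_like(z_prev)
        z_t = (torch.sqrt(a_t / a_prev) * z_prev
               + torch.sqrt(1.0 - a_t / a_prev) * noise).detach()

    return z_prev.detach()
```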
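As a small illustration of how the single coefficient s_0 parameterizes the per-step guidance strength, the snippet below evaluates s_t = s_0 · √(1 − α_t) over a hypothetical linear-beta noise schedule; the schedule constants and the value of s_0 are placeholders, not the settings reported in Tabs. 3 and 4.

```python
import torch

# Hypothetical linear beta schedule (not the paper's exact values).
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = torch.cumprod(1.0 - betas, dim=0)  # cumulative alpha_t

s0 = 10.0                          # illustrative coefficient from Tab. 3's role
s = s0 * torch.sqrt(1.0 - alphas)  # s_t = s0 * sqrt(1 - alpha_t)
print(s[0].item(), s[-1].item())
```

One consequence of this schedule: guidance is strongest at high noise levels (α_t near 0, early in reverse sampling) and decays toward zero as the sample denoises.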