Diffusion Self-Guidance for Controllable Image Generation

Authors: Dave Epstein, Allan Jabri, Ben Poole, Alexei Efros, Aleksander Holynski

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | All experiments were performed on Imagen [32], producing 1024x1024 samples.
Researcher Affiliation | Collaboration | Dave Epstein (UC Berkeley, Google Research), Allan Jabri (UC Berkeley), Ben Poole (Google Research), Alexei A. Efros (UC Berkeley), Aleksander Holynski (UC Berkeley, Google Research)
Pseudocode | No | The paper contains mathematical equations and descriptive text for its method, but no explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions a project page for 'results and an interactive demo' (https://dave.ml/selfguidance) but does not explicitly state that the source code for the methodology described in the paper is released or available at this link or elsewhere.
Open Datasets | No | The paper states that experiments were performed on Imagen [32], which is a pre-trained model, but does not provide specific access information (link, DOI, or explicit citation for dataset authors/year) for any dataset used in its experiments, nor for the 'real images' mentioned.
Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits for the experiments conducted using self-guidance. Mentions of training pertain to the underlying diffusion model used (Imagen), not to data splits for the paper's own evaluation.
Hardware Specification | No | The paper states 'All experiments were performed on Imagen [32]' but does not provide specific hardware details such as GPU models, CPU types, or memory used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or CUDA versions) required to replicate the experiments.
Experiment Setup | Yes | We apply our self-guidance term following best practices for classifier-free guidance on Imagen [32]. Specifically, where N is the number of DDPM steps, we take the first 3N/16 steps with self-guidance and the last N/32 without; the remaining 25N/32 steps alternate between using self-guidance and not using it. We use N = 1024 steps. Our method works with 256 and 512 steps as well, though self-guidance weights occasionally require adjustment. We set v = 7500 in Eqn. 4 as an overall scale for the gradients of the functions g defined below; we find that the magnitude of per-pixel gradients is quite small (often in the range of 10^-7 to 10^-6), so such a high weight is needed to induce changes.
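
The reported schedule (first 3N/16 steps guided, last N/32 unguided, remaining 25N/32 alternated) can be summarized as a simple step-indexed toggle. The sketch below is illustrative only; the function name and structure are assumptions, not the authors' code.

```python
def use_self_guidance(step: int, num_steps: int = 1024) -> bool:
    """Return True if self-guidance should be applied at a given DDPM step.

    Schedule described in the paper's setup: the first 3N/16 steps use
    self-guidance, the last N/32 do not, and the remaining 25N/32 steps
    alternate between guided and unguided.
    """
    head = 3 * num_steps // 16      # always guided
    tail = num_steps // 32          # never guided
    if step < head:
        return True
    if step >= num_steps - tail:
        return False
    # Middle region: alternate guided / unguided steps.
    return (step - head) % 2 == 0


if __name__ == "__main__":
    N = 1024  # number of DDPM steps reported in the paper
    guided = sum(use_self_guidance(t, N) for t in range(N))
    print(f"{guided}/{N} steps apply self-guidance")
```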