Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention

Authors: Susung Hong

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, SEG achieves a Pareto improvement in both quality and the reduction of side effects. We validate the effectiveness of SEG throughout the various experiments without and with text conditions, and ControlNet [51] trained on canny and depth maps.
Researcher Affiliation | Academia | Susung Hong, University of Washington
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/SusungHong/SEG-SDXL.
Open Datasets | Yes | We use various metrics to evaluate quality (FID [10] and CLIP score [37], calculated with 30k references from the MS-COCO 2014 validation set [28]) and to assess the extent of change due to applied guidance (LPIPS_{vgg, alex} [52]).
Dataset Splits | Yes | We use various metrics to evaluate quality (FID [10] and CLIP score [37], calculated with 30k references from the MS-COCO 2014 validation set [28]) and to assess the extent of change due to applied guidance (LPIPS_{vgg, alex} [52]).
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using specific schedulers (Euler discrete scheduler, DDIM scheduler) but does not list general software dependencies with version numbers (e.g., PyTorch version, Python version).
Experiment Setup | Yes | We set γ_seg to 3.0, except in the ablation study. For SEG and PAG sampling, we use the Euler discrete scheduler [21], while for SAG [17], we instead use the DDIM scheduler [45] since the current implementation of SAG does not support the Euler discrete sampler. For SAG and PAG, we use the same configurations they used in the experiments with the previous version of Stable Diffusion, with guidance scales of 1.0 and 3.0, respectively. For the results, we use σ ∈ {1, 2, 5, 10}.
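
The experiment setup quoted above can be written down as a small configuration grid. The following is a minimal sketch, not code from the paper or its repository: the `GuidanceRun` class, the `build_runs` helper, and the scheduler label strings are hypothetical names chosen for illustration. Only the numeric values (γ_seg = 3.0, the σ set, and the SAG and PAG guidance scales) and the scheduler choices come from the quoted text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Values taken from the quoted experiment setup.
SEG_GUIDANCE_SCALE = 3.0      # gamma_seg, outside the ablation study
SEG_SIGMAS = (1, 2, 5, 10)    # sigma values reported for SEG results


@dataclass(frozen=True)
class GuidanceRun:
    """One sampling configuration (hypothetical container, for illustration)."""
    method: str                    # "SEG", "PAG", or "SAG"
    scheduler: str                 # descriptive label, not a library class name
    guidance_scale: float
    sigma: Optional[float] = None  # only set for SEG runs


def build_runs() -> List[GuidanceRun]:
    """Enumerate the configurations described in the setup row."""
    # SEG: Euler discrete scheduler, one run per sigma value.
    runs = [
        GuidanceRun("SEG", "euler_discrete", SEG_GUIDANCE_SCALE, float(s))
        for s in SEG_SIGMAS
    ]
    # PAG: Euler discrete scheduler, guidance scale 3.0.
    runs.append(GuidanceRun("PAG", "euler_discrete", 3.0))
    # SAG: DDIM scheduler (the quoted text notes SAG lacks Euler support),
    # guidance scale 1.0.
    runs.append(GuidanceRun("SAG", "ddim", 1.0))
    return runs


if __name__ == "__main__":
    for run in build_runs():
        print(run)
```

Listing the runs this way makes the asymmetry in the comparison explicit: SAG is sampled with a different scheduler than SEG and PAG, which is worth keeping in mind when reading the reported metrics.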