Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
Authors: Susung Hong
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, SEG achieves a Pareto improvement in both quality and the reduction of side effects. We validate the effectiveness of SEG throughout the various experiments without and with text conditions, and ControlNet [51] trained on canny and depth maps. |
| Researcher Affiliation | Academia | Susung Hong, University of Washington |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/SusungHong/SEG-SDXL. |
| Open Datasets | Yes | We use various metrics to evaluate quality (FID [10] and CLIP score [37], calculated with 30k references from the MS-COCO 2014 validation set [28]) and to assess the extent of change due to applied guidance (LPIPS, vgg and alex variants [52]). |
| Dataset Splits | Yes | We use various metrics to evaluate quality (FID [10] and CLIP score [37], calculated with 30k references from the MS-COCO 2014 validation set [28]) and to assess the extent of change due to applied guidance (LPIPS, vgg and alex variants [52]). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using specific schedulers (Euler discrete scheduler, DDIM scheduler) but does not list general software dependencies with version numbers (e.g., PyTorch version, Python version). |
| Experiment Setup | Yes | We set γ_seg to 3.0, except in the ablation study. For SEG and PAG sampling, we use the Euler discrete scheduler [21], while for SAG [17], we instead use the DDIM scheduler [45] since the current implementation of SAG does not support the Euler discrete sampler. For SAG and PAG, we use the same configurations they used in the experiments with the previous version of Stable Diffusion, with guidance scales of 1.0 and 3.0, respectively. For the results, we use σ ∈ {1, 2, 5, 10}. |
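
The Experiment Setup row can be written down as a small configuration grid. The following is a minimal sketch, not the author's code: the names `GuidanceConfig`, `SEG_SIGMAS`, `build_configs`, and the scheduler strings `"euler_discrete"` and `"ddim"` are hypothetical labels chosen for illustration. Only the numeric values and the method-to-scheduler pairing come from the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GuidanceConfig:
    """One sampling configuration (hypothetical container, not from the paper)."""
    method: str                    # "SEG", "PAG", or "SAG"
    scheduler: str                 # illustrative label for the scheduler used
    guidance_scale: float          # guidance scale reported in the table
    sigma: Optional[float] = None  # sigma value; only set for SEG


# Sigma values reported for the SEG results.
SEG_SIGMAS = (1.0, 2.0, 5.0, 10.0)


def build_configs() -> List[GuidanceConfig]:
    """Enumerate the configurations described in the Experiment Setup row."""
    # SEG: guidance scale 3.0 with the Euler discrete scheduler, one run per sigma.
    configs = [GuidanceConfig("SEG", "euler_discrete", 3.0, s) for s in SEG_SIGMAS]
    # PAG: guidance scale 3.0, Euler discrete scheduler.
    configs.append(GuidanceConfig("PAG", "euler_discrete", 3.0))
    # SAG: guidance scale 1.0, DDIM scheduler.
    configs.append(GuidanceConfig("SAG", "ddim", 1.0))
    return configs


if __name__ == "__main__":
    for cfg in build_configs():
        print(cfg)
```

Keeping the sigma sweep separate from the baseline entries makes it explicit that only SEG varies σ, while PAG and SAG each use a single fixed configuration.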