Understanding and Improving Training-free Loss-based Diffusion Guidance
Authors: Yifei Shen, Xinyang Jiang, Yifan Yang, Yezhen Wang, Dongqi Han, Dongsheng Li
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments in image and motion generation confirm the efficacy of these techniques. In this section, we evaluate the efficacy of our proposed techniques across various diffusion models and guidance conditions. We compare our methods with established baselines: Universal Guidance (UG) [2], Loss-Guided Diffusion with Monte Carlo (LGD-MC) [37], Training-Free Energy-Guided Diffusion Models (FreeDoM) [49], and Manifold Preserving Guided Diffusion (MPGD) [16]. |
| Researcher Affiliation | Collaboration | Yifei Shen¹, Xinyang Jiang¹, Yifan Yang¹, Yezhen Wang², Dongqi Han¹, Dongsheng Li¹; ¹Microsoft Research Asia, ²National University of Singapore |
| Pseudocode | Yes | Algorithm 1 Random Augmentation, Algorithm 2 Polyak Step Size, Algorithm 3 Time Travel (hedged sketches of Algorithms 2 and 3 follow the table) |
| Open Source Code | Yes | The code is available at https://github.com/BIGKnight/Understanding-Training-free-Diffusion-Guidance |
| Open Datasets | Yes | Specifically, we utilize the CelebA-HQ diffusion model [19] to generate high-quality facial images. For the unconditional ImageNet diffusion, we employ text guidance in line with the approach used in FreeDoM and UG [2, 49]. In this subsection, we extend our evaluation to human motion generation using the Motion Diffusion Model (MDM) [40], which represents motion through a sequence of joint coordinates and is trained on a large corpus of text-motion pairs with classifier-free guidance. |
| Dataset Splits | No | The paper mentions using pre-trained models and datasets like CelebA-HQ, ImageNet, and MDM, but does not provide specific train/validation/test split percentages or numbers for their experiments. |
| Hardware Specification | Yes | These experiments were conducted on a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper does not explicitly list specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We implement Polyak step size within the context of a training-free guidance framework called FreeDoM [49] and benchmark the performance of this implementation using the DDIM sampler with 50 steps. For the sampling method, DDIM with 100 steps is adopted as in [49, 37]. In FreeDoM and MPGD-Z, resampling is conducted for time steps ranging from 800 to 300, with the time-travel number fixed at 10, as described in [49]. (A sketch of this time-travel loop follows the table.) |
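To make the Polyak step size row concrete, below is a minimal PyTorch sketch of a single training-free guidance update using a Polyak step size, in the spirit of Algorithm 2. It assumes the guidance loss has an optimal value of 0, so the Polyak rule reduces to step = loss / ||grad||²; `x0_pred_fn` and `loss_fn` are hypothetical placeholders (e.g., a denoiser-based clean-sample estimate and a differentiable guidance loss), not the paper's released code.

```python
import torch

def polyak_guidance_update(x_t, x0_pred_fn, loss_fn, eps=1e-8):
    """One training-free guidance update with a Polyak step size (sketch).

    Assumes the optimal loss value is 0, so the Polyak step is
    loss / ||grad||^2. `x0_pred_fn` maps the noisy sample x_t to a
    predicted clean sample; `loss_fn` returns a scalar guidance loss.
    """
    x_t = x_t.detach().requires_grad_(True)
    x0_hat = x0_pred_fn(x_t)                 # predicted clean sample
    loss = loss_fn(x0_hat)                   # scalar guidance loss
    grad, = torch.autograd.grad(loss, x_t)   # gradient w.r.t. the noisy sample
    step = loss.detach() / (grad.pow(2).sum() + eps)  # Polyak step size
    return (x_t - step * grad).detach()

# Toy usage: pull samples toward zero-mean images (identity stands in
# for the denoiser-based x0 prediction).
x_t = torch.randn(4, 3, 64, 64)
x0_pred_fn = lambda x: x
loss_fn = lambda x0: x0.mean(dim=(1, 2, 3)).pow(2).sum()
x_t = polyak_guidance_update(x_t, x0_pred_fn, loss_fn)
```

The appeal of the Polyak rule here is that the step size adapts automatically: it shrinks as the guidance loss approaches its optimum, removing one hand-tuned hyperparameter from the sampler.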
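The resampling schedule in the experiment setup row (time travel for diffusion timesteps 800 down to 300, travel number fixed at 10) can be sketched as the loop below. This is an illustrative reconstruction, not the released implementation: `guided_step` is a hypothetical callable performing one guided reverse step `x_t -> x_{t-1}`, and `betas` is the forward-process noise schedule indexed by diffusion timestep.

```python
import torch

def sample_with_time_travel(guided_step, x_T, betas,
                            travel_lo=300, travel_hi=800, travel_num=10):
    """Reverse sampling loop with time-travel resampling (hedged sketch).

    For timesteps t in [travel_lo, travel_hi], each guided reverse step
    is repeated `travel_num` times: after producing x_{t-1}, the sample
    is pushed back to x_t through one forward diffusion step and the
    reverse step is redone, following the time-travel strategy the
    paper attributes to FreeDoM [49].
    """
    x = x_T
    for t in reversed(range(len(betas))):
        repeats = travel_num if travel_lo <= t <= travel_hi else 1
        for r in range(repeats):
            x = guided_step(x, t)  # guided reverse step: x_t -> x_{t-1}
            if r < repeats - 1:    # travel back: x_{t-1} -> x_t, then redo
                noise = torch.randn_like(x)
                x = torch.sqrt(1.0 - betas[t]) * x + torch.sqrt(betas[t]) * noise
    return x
```

Note that the 800-to-300 window refers to diffusion timesteps on the original 1000-step schedule, not to the 100 DDIM sampler steps; a real implementation would map sampler steps to these timesteps before applying the window.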