TFG: Unified Training-Free Guidance for Diffusion Models
Authors: Haotian Ye, Haowei Lin, Jiaqi Han, Minkai Xu, Sheng Liu, Yitao Liang, Jianzhu Ma, James Y. Zou, Stefano Ermon
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We systematically benchmark across 7 diffusion models on 16 tasks with 40 targets, and improve performance by 8.5% on average. |
| Researcher Affiliation | Academia | Stanford University, Peking University, Tsinghua University |
| Pseudocode | Yes | Algorithm 1: Training-Free Guidance |
| Open Source Code | Yes | Code is available at https://github.com/YWolfeee/Training-Free-Guidance. |
| Open Datasets | Yes | We conduct a case study on CIFAR10 [30]... (1) CIFAR10-DDPM [48] is a U-Net [54] model trained on CIFAR10 [30] images. (2) ImageNet-DDPM [7] is a larger U-Net model trained on ImageNet-1k [55] images. (3) Cat-DDPM is trained on Cat [12] images. (4) CelebA-DDPM is trained on the CelebA-HQ dataset [26]... (5) Molecule-EDM [24] is an equivariant diffusion model pretrained on the molecule dataset QM9 [50]... |
| Dataset Splits | Yes | For the dataset, we employ QM9 [50] and adopt the split in [24] with 100,000 training samples. Following [24] and [3], the training set is further split into two halves, which guarantees that there is no data leakage. The first half is used to train a property prediction network... The second half is used to train the diffusion model as well as the guidance network. |
| Hardware Specification | Yes | We run most of the experiments on clusters using NVIDIA A100s. |
| Software Dependencies | No | We implemented our experiments using PyTorch [49] and the Hugging Face library. (Appendix E.5) |
| Experiment Setup | Yes | We consistently set the time step T = 100 and the DDIM parameter η = 1. We consider N_recur = 1, N_iter = 4 and use a single sample for Implicit Dynamic (Line 4) throughout all experiments and methods for fair comparison. For TFG, the structures of ρ and µ are set to increase and the scalars ρ, µ, γ are determined via our searching strategy. |
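
The settings quoted in the Experiment Setup and Dataset Splits rows can be gathered into a small configuration sketch. This is only an illustration and not the authors' code: the names `TFGSetup` and `split_in_halves`, the field names, and the seeded random shuffle are all assumptions made here. The paper adopts the split from [24], and the exact procedure is not described in the table. The scalars ρ, µ, γ are left unset because the table only says they come from the paper's search strategy.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple
import random


@dataclass(frozen=True)
class TFGSetup:
    """Hypothetical container for the values quoted in the table."""
    num_steps: int = 100         # T, the number of time steps
    ddim_eta: float = 1.0        # DDIM parameter eta
    n_recur: int = 1             # N_recur
    n_iter: int = 4              # N_iter
    n_samples: int = 1           # single sample for Implicit Dynamic
    schedule: str = "increase"   # structure of rho and mu
    # Scalars found by the paper's search; no values are given in the table.
    rho: Optional[float] = None
    mu: Optional[float] = None
    gamma: Optional[float] = None


def split_in_halves(n: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle indices 0..n-1 and return two disjoint halves.

    Assumption: a seeded random shuffle stands in for the split of [24].
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    mid = n // 2
    return indices[:mid], indices[mid:]


setup = TFGSetup()
# First half: property prediction network.
# Second half: diffusion model and guidance network.
first_half, second_half = split_in_halves(100_000)
```

Because the two halves share no indices, a property predictor trained on the first half never sees the samples used for the diffusion model and guidance network, which is the "no data leakage" condition the Dataset Splits row describes.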