EM Distillation for One-step Diffusion Models
Authors: Sirui Xie, Zhisheng Xiao, Diederik Kingma, Tingbo Hou, Ying Nian Wu, Kevin P. Murphy, Tim Salimans, Ben Poole, Ruiqi Gao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | EMD outperforms existing one-step generative methods in terms of FID scores on ImageNet-64 and ImageNet-128, and compares favorably with prior work on distilling text-to-image diffusion models. |
| Researcher Affiliation | Collaboration | 1Google DeepMind, 2Google Research, 3UCLA |
| Pseudocode | Yes | Algorithm 1: EM Distillation (an illustrative sketch follows the table) |
| Open Source Code | No | We have not open sourced the model or code, but our approach is data-free so no training data is required. We also provide implementation details in the appendix that we hope are sufficient for reproducing our results. |
| Open Datasets | Yes | We employ EMD to learn one-step image generators on ImageNet 64×64, ImageNet 128×128 [60] and text-to-image generation. |
| Dataset Splits | No | The paper does not explicitly state details about the validation dataset split (e.g., percentages, sample counts, or explicit mention of a validation set used in their specific experiments), beyond referencing the overall datasets. |
| Hardware Specification | Yes | We run the distillation training for 300k steps (roughly 8 days) on 64 TPU-v4. We run the distillation training for 200k steps (roughly 10 days) on 128 TPU-v5p. Our method, EMD-8, trained on 256 TPU-v5e for 5 hours (5000 steps)... |
| Software Dependencies | No | The paper discusses software components and models (e.g., Stable Diffusion v1.5, Adam optimizer) but does not list specific version numbers for software dependencies required for replication. |
| Experiment Setup | Yes | Other hyperparameters are listed in Tables 7, 8, and 9. |
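The paper's pseudocode (Algorithm 1: EM Distillation) is not reproduced here. As a rough orientation only, the sketch below is a minimal, toy-scale training step in the spirit of the method: an E-step that corrects generator samples with a few Langevin steps under the teacher score, and an M-step that updates the one-step generator with a (teacher minus auxiliary) score-difference signal, alongside an auxiliary score network fitted to generator samples. All names (`teacher_score`, `fake_score`, `generator`, `train_step`), the MLP architectures, the single fixed noise level, and the Langevin correction in x-space only (the paper corrects jointly over noise and data) are simplifying assumptions, not the authors' Algorithm 1 or their released configuration.

```python
# Toy sketch of an EM-Distillation-style training step (illustrative only).
import torch
import torch.nn as nn

DIM = 8  # toy data dimensionality (placeholder)

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.SiLU(), nn.Linear(64, out_dim))

teacher_score = mlp(DIM + 1, DIM)  # stand-in for a pretrained, frozen teacher score s(x, t)
fake_score = mlp(DIM + 1, DIM)     # auxiliary score of the generator's own distribution
generator = mlp(DIM, DIM)          # one-step generator g_theta(z)

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_f = torch.optim.Adam(fake_score.parameters(), lr=1e-4)

def score(net, x, t):
    # Concatenate the noise level as an extra input feature (toy conditioning).
    return net(torch.cat([x, t], dim=-1))

def train_step(batch_size=32, langevin_steps=8, step_size=1e-2, sigma=0.5):
    t = torch.full((batch_size, 1), sigma)  # single fixed noise level (simplification)

    # E-step (simplified): draw x = g_theta(z), then run a few Langevin corrections
    # on x using the teacher score.
    z = torch.randn(batch_size, DIM)
    with torch.no_grad():
        x = generator(z)
        for _ in range(langevin_steps):
            x = x + 0.5 * step_size * score(teacher_score, x, t) \
                  + (step_size ** 0.5) * torch.randn_like(x)

    # Fit the auxiliary score to current generator samples via denoising score matching.
    x_gen = generator(torch.randn(batch_size, DIM)).detach()
    eps = torch.randn_like(x_gen)
    loss_f = ((score(fake_score, x_gen + sigma * eps, t) * sigma + eps) ** 2).mean()
    opt_f.zero_grad(); loss_f.backward(); opt_f.step()

    # M-step: push generator outputs along the (teacher - fake) score difference
    # evaluated at the Langevin-corrected samples.
    with torch.no_grad():
        grad = score(teacher_score, x, t) - score(fake_score, x, t)
    x_theta = generator(z)
    loss_g = -(x_theta * grad).sum(dim=-1).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_g.item(), loss_f.item()

if __name__ == "__main__":
    for _ in range(3):
        print(train_step())
```

The score-difference M-step is shared with earlier distribution-matching distillation methods (e.g., Diff-Instruct/VSD-style objectives); the Langevin correction of generator samples before evaluating that signal is the EM-style E-step that distinguishes the approach described in the paper.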