One-Step Diffusion Distillation via Deep Equilibrium Models
Authors: Zhengyang Geng, Ashwini Pokle, J. Zico Kolter
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the effectiveness of our proposed Generative Equilibrium Transformer (GET) in offline distillation of diffusion models through a series of experiments on single-step class-conditional and unconditional image generation. [...] We report all our results on CIFAR-10 |
| Researcher Affiliation | Collaboration | Zhengyang Geng (Carnegie Mellon University, zgeng2@cs.cmu.edu); Ashwini Pokle (Carnegie Mellon University, apokle@cs.cmu.edu); J. Zico Kolter (Carnegie Mellon University and Bosch Center for AI, zkolter@cs.cmu.edu) |
| Pseudocode | No | The paper describes the architecture and process in text and diagrams but does not include formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code, checkpoints, and datasets are available here. |
| Open Datasets | Yes | We report all our results on CIFAR-10 [52] |
| Dataset Splits | No | The paper uses CIFAR-10, a standard dataset, but does not explicitly mention training/validation/test splits or a validation set. |
| Hardware Specification | Yes | The entire process of data generation takes about 4 hours on 4 NVIDIA A6000 GPUs using PyTorch [74] Distributed Data Parallel (DDP) and a batch size of 128 per GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch [74]' but does not specify the version number of the software. |
| Experiment Setup | Yes | We use AdamW [63] optimizer with a learning rate of 1e-4, a batch size of 128 (denoted as 1× BS), and 800k training iterations, which are identical to Progressive Distillation (PD) [88]. For conditional models, we adopt a batch size of 256 (2× BS). No warm-up, weight decay, or learning rate decay is applied. We convert input noise to patches of size 2 × 2. We use 6 steps of fixed point iterations in the forward pass of GET-DEQ and differentiate through it. |
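
The settings quoted in the Experiment Setup row can be collected into a small configuration object, shown here as a minimal sketch rather than the authors' code. The class name `GetTrainConfig`, its field names, and the helper `fixed_point_iterate` are hypothetical. The toy layer is a simple contraction chosen only to illustrate a fixed number of fixed point iterations; it is not the GET architecture, and it does not show differentiating through the iterations.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class GetTrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""
    optimizer: str = "AdamW"
    learning_rate: float = 1e-4      # constant: no learning rate decay
    base_batch_size: int = 128       # "1x BS" in the quoted notation
    batch_size_multiplier: int = 1   # 2 for conditional models (batch size 256)
    train_iterations: int = 800_000
    weight_decay: float = 0.0        # no weight decay
    warmup_steps: int = 0            # no warm-up
    patch_size: int = 2              # input noise is split into 2 x 2 patches
    fixed_point_steps: int = 6       # forward-pass iterations in GET-DEQ

    @property
    def batch_size(self) -> int:
        return self.base_batch_size * self.batch_size_multiplier

    def num_patches(self, image_size: int = 32) -> int:
        """Patches per image; 32 is the side length of a CIFAR-10 image."""
        per_side = image_size // self.patch_size
        return per_side * per_side


Vector = List[float]


def fixed_point_iterate(
    f: Callable[[Vector, Vector], Vector], z0: Vector, x: Vector, steps: int
) -> Vector:
    """Apply z <- f(z, x) a fixed number of times, starting from z0."""
    z = list(z0)
    for _ in range(steps):
        z = f(z, x)
    return z


def toy_layer(z: Vector, x: Vector) -> Vector:
    """Toy contraction (assumption, not GET): z <- 0.5 * z + x, fixed point 2 * x."""
    return [0.5 * zi + xi for zi, xi in zip(z, x)]


if __name__ == "__main__":
    cfg = GetTrainConfig()
    conditional_cfg = GetTrainConfig(batch_size_multiplier=2)
    print(cfg.batch_size, conditional_cfg.batch_size, cfg.num_patches())
    print(fixed_point_iterate(toy_layer, [0.0, 0.0], [1.0, -2.0], cfg.fixed_point_steps))
```

With patches of size 2 × 2, a 32 × 32 CIFAR-10 image yields 16 × 16 = 256 patches. After 6 iterations the toy layer is close to, but not exactly at, its fixed point, which is why a DEQ-style model trained with a small fixed iteration count is differentiated through those iterations, as the quoted setup states.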