Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Data-free Distillation of Diffusion Models with Bootstrapping
Authors: Jiatao Gu, Chen Wang, Shuangfei Zhai, Yizhe Zhang, Lingjie Liu, Joshua M. Susskind
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the experiments, we first demonstrate the efficacy of BOOT on various challenging image generation benchmarks, including unconditional and class-conditional settings. Next, we show that the proposed method can be easily adopted to distill text-to-image diffusion models. |
| Researcher Affiliation | Collaboration | 1Apple 2University of Pennsylvania. Correspondence to: Jiatao Gu <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Distillation using BOOT for Conditional Diffusion Models. |
| Open Source Code | No | The paper mentions using open-sourced models as teachers but does not provide any statement or link indicating that the code for their proposed method (BOOT) is open-source or publicly available. |
| Open Datasets | Yes | FFHQ (https://github.com/NVlabs/ffhq-dataset) contains 70k images of real human faces in resolution of 1024×1024. ... ImageNet-1K (https://image-net.org/download.php) contains 1.28M images across 1000 classes. ... Specifically, we utilize DiffusionDB (Wang et al., 2022), a large-scale prompt dataset that contains 14 million images generated by Stable Diffusion using prompts provided by real users. ... DiffusionDB (https://poloclub.github.io/diffusiondb/) contains 14M images generated by Stable Diffusion using prompts and hyperparameters specified by users. |
| Dataset Splits | Yes | For text-to-image tasks, we measure the zero-shot CLIP score (Radford et al., 2021) for measuring the faithfulness of generation given 5000 randomly sampled captions from COCO2017 (Lin et al., 2014) validation set. |
| Hardware Specification | Yes | In addition, we report the speed by fps on a single A100 GPU. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., "Python 3.8, PyTorch 1.9") needed to replicate the experiment. |
| Experiment Setup | Yes | Table 3. Hyperparameters used for training BOOT. The table includes specific details such as Denoising resolution (e.g., 64x64), Base channels (e.g., 128), Multipliers (e.g., 1,2,3,4), Bootstrapping step size (e.g., 0.04), CFG weight (e.g., 1, 5), Learning rate (e.g., 1e-4), Batch size (e.g., 128), and Training iterations (e.g., 500k). |
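The Experiment Setup row above reports the paper's Table 3 hyperparameters in prose. As a minimal sketch, they can be collected into a plain config dict; the key names here are our own choice for illustration, since the paper does not release code:

```python
# Hedged sketch: hyperparameters as reported in Table 3 of the paper,
# gathered into a config dict. Key names are illustrative assumptions,
# not identifiers from any released codebase.
boot_config = {
    "denoising_resolution": (64, 64),   # e.g., 64x64
    "base_channels": 128,
    "channel_multipliers": (1, 2, 3, 4),
    "bootstrapping_step_size": 0.04,
    "cfg_weights": (1, 5),              # classifier-free guidance weights
    "learning_rate": 1e-4,
    "batch_size": 128,
    "training_iterations": 500_000,     # 500k
}

def describe(cfg: dict) -> str:
    """Render the config as a single human-readable summary line."""
    return ", ".join(f"{key}={value}" for key, value in cfg.items())

print(describe(boot_config))
```

Keeping the reported values in one structure like this makes it easy to diff a reimplementation's settings against the paper's stated setup.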