Improved Distribution Matching Distillation for Fast Image Synthesis

Authors: Tianwei Yin, Michaël Gharbi, Taesung Park, Richard Zhang, Eli Shechtman, Frédo Durand, William T. Freeman

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental (5 experiments) | "We evaluate our approach, DMD2, using several benchmarks, including class-conditional image generation on ImageNet-64×64 [62], and text-to-image synthesis on COCO 2014 [63] with various teacher models [1, 58]. We use the Fréchet Inception Distance (FID) [60] to measure image quality and diversity, and the CLIP Score [64] to evaluate text-to-image alignment." (metric computation sketched below the table)
Researcher Affiliation | Collaboration | "¹Massachusetts Institute of Technology, ²Adobe Research"
Pseudocode | Yes | "Algorithm 1: DMD (original). Input: pretrained real diffusion model μ_real, paired ODE solution pairs D = {(z_ref, y_ref)}. Output: trained generator G"
Open Source Code | Yes | "We release our code and pretrained models."
Open Datasets | Yes | "Our generators are trained by distilling SDXL [58] and SD v1.5 [1], respectively, using a subset of 3 million prompts from LAION-Aesthetics [59]." (teacher loading sketched below the table)
Dataset Splits | No | The paper uses the COCO 2014 validation set for evaluation, but does not specify explicit train/validation/test splits for the training data (e.g., LAION-Aesthetics), nor a validation split used during model training.
Hardware Specification | Yes | "We use a batch size of 280 and train the model on 7 A100 GPUs for 200K iterations."
Software Dependencies | No | The paper mentions using the AdamW optimizer but does not specify software versions for the libraries, frameworks, or programming languages used (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | "For the standard training setup, we use the AdamW optimizer [88] with a learning rate of 2×10⁻⁶, a weight decay of 0.01, and beta parameters (0.9, 0.999). We use a batch size of 280 and train the model on 7 A100 GPUs for 200K iterations, which takes approximately 2 days. The number of fake diffusion model updates per generator update is set to 5. The weight for the GAN loss is set to 3×10⁻³." (optimizer and update schedule sketched below)
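Below is a minimal, runnable PyTorch sketch of the optimization schedule quoted in the Experiment Setup row (AdamW with learning rate 2×10⁻⁶, weight decay 0.01, betas (0.9, 0.999), five fake diffusion model updates per generator update, GAN loss weight 3×10⁻³). The networks and losses are toy stand-ins chosen for illustration only; they are not the authors' released implementation.

```python
# Toy sketch of the update schedule described in the "Experiment Setup" row.
# The modules and losses are placeholders, NOT the authors' code.
import torch
import torch.nn as nn

generator = nn.Linear(16, 16)       # stand-in for the one-step student generator
fake_score_net = nn.Linear(16, 16)  # stand-in for the "fake" diffusion model

opt_g = torch.optim.AdamW(generator.parameters(),
                          lr=2e-6, weight_decay=0.01, betas=(0.9, 0.999))
opt_fake = torch.optim.AdamW(fake_score_net.parameters(),
                             lr=2e-6, weight_decay=0.01, betas=(0.9, 0.999))

GAN_WEIGHT = 3e-3         # weight for the GAN loss term
FAKE_UPDATES_PER_GEN = 5  # fake diffusion model updates per generator update

for step in range(10):    # the paper trains for 200K iterations
    noise = torch.randn(4, 16)

    # 1) Several updates of the fake score network on samples drawn from the
    #    current generator.
    for _ in range(FAKE_UPDATES_PER_GEN):
        opt_fake.zero_grad()
        fake_images = generator(noise).detach()
        loss_fake = fake_score_net(fake_images).pow(2).mean()  # toy denoising loss
        loss_fake.backward()
        opt_fake.step()

    # 2) One generator update combining a distribution-matching term and a
    #    GAN term, mirroring the loss weighting quoted above.
    opt_g.zero_grad()
    fake_images = generator(noise)
    loss_dmd = fake_score_net(fake_images).mean()  # toy DMD surrogate
    loss_gan = fake_images.pow(2).mean()           # toy GAN surrogate
    (loss_dmd + GAN_WEIGHT * loss_gan).backward()
    opt_g.step()
```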
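The Research Type row cites FID and CLIP Score as the evaluation metrics. One possible way to compute them is with the torchmetrics library, as sketched below; the paper does not state which metric implementation was used, so the tooling choice here is an assumption for illustration only (it additionally needs the torch-fidelity and transformers backends).

```python
# One possible way to compute the two metrics named in the paper (FID and
# CLIP Score), here via torchmetrics; treat this purely as an illustration.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.multimodal.clip_score import CLIPScore

fid = FrechetInceptionDistance(feature=2048)
clip_score = CLIPScore(model_name_or_path="openai/clip-vit-base-patch16")

# Dummy uint8 images standing in for reference COCO images and generated samples.
real_images = torch.randint(0, 256, (8, 3, 299, 299), dtype=torch.uint8)
fake_images = torch.randint(0, 256, (8, 3, 299, 299), dtype=torch.uint8)
prompts = ["a photo of a dog"] * 8

fid.update(real_images, real=True)
fid.update(fake_images, real=False)
print("FID:", fid.compute().item())

clip_score.update(fake_images, prompts)
print("CLIP Score:", clip_score.compute().item())
```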
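The Open Datasets row names SDXL and SD v1.5 as the teacher models; both are public checkpoints. A hedged sketch of loading them with Hugging Face diffusers follows; the repository IDs are the commonly referenced public ones and are an assumption, since the paper does not specify how the checkpoints were obtained.

```python
# Loading the two teachers named in the paper (SD v1.5 and SDXL) via
# Hugging Face diffusers. The repo IDs are assumed for illustration.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionXLPipeline

sd15 = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
sdxl = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16)

# The distillation targets are the denoising UNets inside these pipelines.
teacher_unet_sd15 = sd15.unet
teacher_unet_sdxl = sdxl.unet
```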