Soft Mixture Denoising: Beyond the Expressive Bottleneck of Diffusion Models

Authors: Yangming Li, Boris van Breugel, Mihaela van der Schaar

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments on multiple image datasets show that SMD significantly improves different types of diffusion models (e.g., DDPM), especially in the situation of few backward iterations." From Section 5, Experiments: "Let us verify how SMD improves the quality and speed of existing diffusion models. First, we use a toy example to visualise that existing diffusion models struggle to learn multivariate Gaussians, whereas SMD does not. Subsequently, we show how SMD significantly improves the FID score across different types of diffusion models (e.g., DDPM, ADM (Dhariwal & Nichol, 2021), and LDM) and datasets." (A sketch of the FID computation appears after the table.)
Researcher Affiliation | Academia | Yangming Li, Boris van Breugel, Mihaela van der Schaar; Department of Applied Mathematics and Theoretical Physics, University of Cambridge; yl874@cam.ac.uk
Pseudocode | Yes | The paper provides Algorithm 1 (Training) and Algorithm 2 (Sampling). (A generic diffusion training/sampling sketch, for orientation only, appears after the table.)
Open Source Code | Yes | "The source code of this work is publicly available at a personal repository: https://github.com/louisli321/smd, and our lab repository: https://github.com/vanderschaarlab/smd."
Open Datasets | Yes | Datasets include CIFAR-10 (Krizhevsky et al., 2009), LSUN-Conference, LSUN-Church (Yu et al., 2015), and CelebA-HQ (Liu et al., 2015).
Dataset Splits | No | The paper mentions using the datasets for evaluation but does not give training, validation, or test splits (e.g., percentages or counts per split).
Hardware Specification | No | The paper does not report the hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The paper mentions various models and networks (e.g., U-Net) and cites frameworks, but does not list software dependencies with version numbers required for reproduction.
Experiment Setup | No | The paper mentions some settings, such as T = 1000 and 100 backward iterations, but lacks a comprehensive description of the experimental setup, including hyperparameters such as the learning rate, batch size, and optimizer. (A sketch of how 100 backward iterations can be strided out of T = 1000 steps is shown after the table.)
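
For the FID score referenced in the Research Type row, the following is a minimal sketch of the Fréchet Inception Distance computed from pre-extracted Inception features. The feature-extraction step is omitted, and the array names (real_feats, fake_feats) are assumptions for illustration; this is not the paper's evaluation code.

```python
# Minimal FID sketch: the Frechet distance between two Gaussians fitted
# to Inception features. real_feats / fake_feats are assumed to be
# (num_samples, 2048) arrays of pooled Inception-v3 activations.
import numpy as np
from scipy import linalg

def fid_from_features(real_feats, fake_feats):
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_f = np.cov(fake_feats, rowvar=False)

    # Matrix square root of the covariance product; tiny imaginary parts
    # arising from numerical error are discarded.
    covmean, _ = linalg.sqrtm(sigma_r @ sigma_f, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(sigma_r + sigma_f - 2.0 * covmean))
```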
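The paper's Algorithm 1 (Training) and Algorithm 2 (Sampling) are specific to soft mixture denoising and are not reproduced in this report. For orientation only, here is a sketch of the standard DDPM-style training step and ancestral sampling loop that SMD builds on; the model interface (eps_model), the linear beta schedule, and all shapes are assumptions, not the authors' code.

```python
# Generic DDPM-style training step and sampler (NOT SMD's Algorithms 1/2).
import torch

T = 1000                                  # matches the T = 1000 noted above
betas = torch.linspace(1e-4, 0.02, T)     # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def training_step(eps_model, x0):
    """One noise-prediction step on a batch of clean images x0: (B, C, H, W)."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,))                     # uniform timesteps
    a_bar = alpha_bars[t].view(b, 1, 1, 1)
    eps = torch.randn_like(x0)                        # forward-process noise
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps
    return ((eps_model(x_t, t) - eps) ** 2).mean()    # simple epsilon loss

@torch.no_grad()
def sample(eps_model, shape):
    """Ancestral sampling from pure noise down to t = 0."""
    x = torch.randn(shape)
    for t in reversed(range(T)):
        t_batch = torch.full((shape[0],), t, dtype=torch.long)
        eps_hat = eps_model(x, t_batch)
        mean = (x - betas[t] / (1.0 - alpha_bars[t]).sqrt() * eps_hat) \
               / alphas[t].sqrt()
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + betas[t].sqrt() * noise
    return x
```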
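On the Experiment Setup row: the paper trains with T = 1000 diffusion steps but samples with only 100 backward iterations. A common way to reconcile the two, and an assumption here since this report does not quote the paper's exact scheme, is a DDIM-style strided subsequence of timesteps:

```python
# Assumed DDIM-style stride: visit a 100-step subsequence of the
# T = 1000 training timesteps, from high noise to low noise.
T, num_backward_iters = 1000, 100
stride = T // num_backward_iters
timesteps = list(range(0, T, stride))[::-1]   # [990, 980, ..., 10, 0]
assert len(timesteps) == num_backward_iters
```

Each backward iteration then denoises from one entry of timesteps directly to the next, trading sample quality for speed; this few-iteration regime is exactly where the paper reports SMD helps most.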