Spontaneous symmetry breaking in generative diffusion models

Authors: Gabriel Raya, Luca Ambrogioni

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Using both theoretical and empirical evidence, we show that an accurate simulation of the early dynamics does not significantly contribute to the final generation, since early fluctuations are reverted to the central fixed point. To leverage this insight, we propose a Gaussian late initialization scheme, which significantly improves model performance, achieving up to 3x FID improvements on fast samplers, while also increasing sample diversity (e.g., racial composition of generated CelebA images). In this section, we present empirical evidence demonstrating the occurrence of the spontaneous symmetry breaking phenomenon in diffusion models across a range of realistic image datasets, including MNIST, CIFAR-10, CelebA 32x32, ImageNet 64x64 and CelebA 64x64." (A sketch of this late-start initialization follows the table.)
Researcher Affiliation | Academia | ¹Jheronimus Academy of Data Science, ²Tilburg University, ³Radboud University, ⁴Donders Institute for Brain, Cognition and Behaviour
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code can be found at https://github.com/gabrielraya/symmetry_breaking_diffusion_models"
Open Datasets | Yes | "We trained diffusion models in discrete time (DDPM)... across a range of realistic image datasets, including MNIST, CIFAR-10, CelebA 32x32, ImageNet 64x64 and CelebA 64x64."
Dataset Splits | No | The paper uses well-known datasets (MNIST, CIFAR-10, CelebA) for training and evaluation, but it does not specify training/validation/test splits (percentages or sample counts), nor does it state that predefined standard splits were used. (A standard-split loading sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions specific models and libraries such as DDPM, DDIM, PNDM, and the DeepFace library (Serengil and Ozpinar, 2020), but it does not provide version numbers for these or any other ancillary software dependencies. (A DeepFace usage sketch follows the table.)
Experiment Setup | Yes | "We trained diffusion models in discrete time (DDPM)... with a time horizon of T = 1000... evaluated fast samplers using Denoising Diffusion Implicit Models (DDIMs) and Pseudo-numerical Methods for Diffusion Models (PNDM). Results are presented for 3, 5, and 10 denoising steps (denoted as n). The DDPM was initialized with the common standard initialization point s_start = 800 for 5 steps and s_start = 900 for 10 steps. Notably, our Gaussian late start initialization (gls-DDPM) with s_start = 400 for both 5 and 10 denoising steps..." (Sketches of the late-start schedule and sampler follow.)
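
To make the Gaussian late initialization concrete, here is a minimal sketch, not the authors' implementation (their repository, linked above, has the actual scheme). It starts the reverse DDPM chain at s_start instead of T, drawing the initial state from a zero-mean Gaussian; the variance choice below (matching the forward-process marginal of zero-centered data) is an assumption, and `model`, `betas`, and `data_var` are placeholders.

```python
import torch

def gls_sample(model, betas, shape, s_start=400, data_var=1.0, device="cpu"):
    """Hypothetical Gaussian-late-start sampler: run reverse diffusion
    from step s_start instead of the full horizon T."""
    alphas = 1.0 - betas                       # betas: (T,) noise schedule
    alpha_bars = torch.cumprod(alphas, dim=0)  # cumulative \bar{alpha}_t
    ab = alpha_bars[s_start]
    # Assumed initialization: variance of the forward marginal at s_start
    # for zero-centered data; the paper's exact formula may differ.
    var = ab * data_var + (1.0 - ab)
    x = torch.sqrt(var) * torch.randn(shape, device=device)
    for t in reversed(range(s_start + 1)):
        t_batch = torch.full((shape[0],), t, device=device, dtype=torch.long)
        eps = model(x, t_batch)                # predicted noise
        # Standard DDPM ancestral update with sigma_t = sqrt(beta_t).
        coef = (1.0 - alphas[t]) / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x
```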
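For the fast-sampler runs (n = 3, 5, and 10 denoising steps), one way to combine a few-step schedule with the late start is to spread the n steps over [0, s_start] rather than [0, T-1]. This helper is a hypothetical illustration, not code from the paper:

```python
import torch

def late_start_timesteps(n_steps: int, s_start: int) -> torch.Tensor:
    """Evenly spaced denoising timesteps from s_start down to 0."""
    return torch.linspace(s_start, 0, n_steps).round().long()

print(late_start_timesteps(5, 400))  # tensor([400, 300, 200, 100, 0])
```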
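The diversity claim in the Research Type row (racial composition of generated CelebA samples) was measured with the DeepFace library, per the Software Dependencies row. Below is a sketch of how such a tally could look using DeepFace's public analyze API; the file paths and sample count are illustrative, and the paper's exact evaluation protocol is not reproduced here.

```python
from collections import Counter
from deepface import DeepFace  # Serengil and Ozpinar, 2020

# Tally the dominant predicted race over a folder of generated samples.
# Recent deepface versions return a list of result dicts per image.
counts = Counter()
for i in range(100):
    results = DeepFace.analyze(
        img_path=f"samples/celeba_{i:04d}.png",  # illustrative path
        actions=["race"],
        enforce_detection=False,  # generated faces may fail strict detection
    )
    counts[results[0]["dominant_race"]] += 1

print(counts)
```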
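Finally, since the paper does not state dataset splits, a reproducer would likely fall back on the predefined splits shipped with each dataset. A hedged example using torchvision's standard CIFAR-10 train/test split (an assumption about the setup, not the paper's stated protocol):

```python
from torchvision import datasets, transforms

tfm = transforms.ToTensor()
# Predefined CIFAR-10 split: 50,000 training / 10,000 test images.
train_set = datasets.CIFAR10(root="data", train=True, download=True, transform=tfm)
test_set = datasets.CIFAR10(root="data", train=False, download=True, transform=tfm)
print(len(train_set), len(test_set))  # 50000 10000
```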