On the Generalization Properties of Diffusion Models
Authors: Puheng Li, Zhong Li, Huishuai Zhang, Jiang Bian
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our findings contribute to the rigorous understanding of diffusion models' generalization properties and provide insights that may guide practical applications. ... Furthermore, these estimates are not solely theoretical constructs but have also been confirmed through numerical simulations. |
| Researcher Affiliation | Collaboration | Puheng Li* (Department of Statistics, Stanford University, puhengli@stanford.edu); Zhong Li* (Machine Learning Group, Microsoft Research Asia, lzhong@microsoft.com); Huishuai Zhang (Machine Learning Group, Microsoft Research Asia, huzhang@microsoft.com); Jiang Bian (Machine Learning Group, Microsoft Research Asia, jiabia@microsoft.com) |
| Pseudocode | No | The paper provides mathematical derivations and descriptions of the model but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/lphLeo/Diffusion_Generalization |
| Open Datasets | Yes | In this subsection, we verify our results on the MNIST dataset using the standard U-net architecture as the score network, which suggests that the adverse effect of modes shift on the generalization performance of diffusion models also appears in general. |
| Dataset Splits | No | The paper mentions training on datasets but does not explicitly provide details about training/validation/test splits (e.g., percentages or counts). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for the experiments. |
| Software Dependencies | No | The paper mentions using the 'SGD optimizer' and 'Adam optimizer', a 'U-net architecture', and a 'one-hidden-layer neural network with Swish activations', but does not specify version numbers for any software or libraries. |
| Experiment Setup | Yes | We select the one-hidden-layer neural network with Swish activations as the score network, which is trained using the SGD optimizer with a fixed learning rate 0.5. The target distribution is set to be a one-dimensional 2-mode Gaussian mixture with the modes distance equalling 6, and the number of data samples is 1000. ... All the configurations remain the same as Section 4.1 except that the learning rate is now 10^-3. (A minimal code sketch of this setup follows the table.) |
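For concreteness, below is a minimal sketch of the toy setup quoted in the Experiment Setup row: a one-hidden-layer Swish score network trained with SGD (fixed learning rate 0.5) on 1000 samples from a one-dimensional 2-mode Gaussian mixture whose modes are 6 apart. The hidden width, noise scale `sigma`, step count, and the single-noise-level denoising score matching objective are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of the paper's toy experiment. Hidden width, noise
# scale sigma, and step count are assumptions, not values from the paper.
import torch
import torch.nn as nn

torch.manual_seed(0)

# 1000 samples from a 1-D two-mode Gaussian mixture whose modes are 6 apart
# (modes at -3 and +3, unit-variance components, equal mixture weights).
n = 1000
signs = 2 * torch.randint(0, 2, (n, 1)).float() - 1  # -1 or +1 per sample
data = 3.0 * signs + torch.randn(n, 1)

# One-hidden-layer score network with Swish (SiLU) activations.
score_net = nn.Sequential(nn.Linear(1, 128), nn.SiLU(), nn.Linear(128, 1))
opt = torch.optim.SGD(score_net.parameters(), lr=0.5)  # fixed lr 0.5, as quoted

sigma = 0.5  # assumed perturbation scale for denoising score matching
for step in range(2000):
    noise = torch.randn_like(data)
    noisy = data + sigma * noise
    target = -noise / sigma  # score of the Gaussian perturbation kernel
    loss = ((score_net(noisy) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 500 == 0:
        print(f"step {step}: DSM loss {loss.item():.4f}")
```

Swish corresponds to PyTorch's `nn.SiLU`; the single fixed noise level above stands in for the time-dependent score matching objective that diffusion-model training actually uses, which the sketch does not attempt to reproduce.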