Improved Sample Complexity Bounds for Diffusion Model Training

Authors: Shivam Gupta, Aditya Parulekar, Eric Price, Zhiyang Xun

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this work, we focus on understanding the sample complexity of training such a model: how many samples are needed to learn an accurate diffusion model using a sufficiently expressive neural network? Prior work [BMR20] showed bounds polynomial in the dimension, desired Total Variation error, and Wasserstein error. We show an exponential improvement in the dependence on Wasserstein error and depth, along with improved dependencies on other relevant parameters.
Researcher Affiliation | Academia | Shivam Gupta (UT Austin, shivamgupta@utexas.edu); Aditya Parulekar (UT Austin, adityaup@cs.utexas.edu); Eric Price (UT Austin, ecprice@cs.utexas.edu); Zhiyang Xun (UT Austin, zxun@cs.utexas.edu)
Pseudocode | Yes | Algorithm 1: Empirical score estimation for s (a hedged sketch of such an estimator follows the table)
Open Source Code | No | We do not have any experiments. We have a small simulation that is just an illustration of a lower bound, which is very small in scope.
Open Datasets | No | The paper refers to training with "m i.i.d. samples x_i ~ q_0" from an abstract distribution q_0, but does not specify any named public datasets or provide access information for specific data used in experiments.
Dataset Splits | No | The paper does not conduct empirical studies with specific datasets, and therefore does not specify any training/test/validation dataset splits.
Hardware Specification | No | The paper focuses on theoretical analysis and does not describe any specific hardware used for running experiments.
Software Dependencies | No | The paper is theoretical and does not specify software dependencies with version numbers used for experiments.
Experiment Setup | No | The paper focuses on theoretical analysis and does not include details about an experimental setup, hyperparameters, or system-level training settings.
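
As context for the Pseudocode row above: the paper's Algorithm 1 performs empirical score estimation, and the abstract describes training a sufficiently expressive neural network on m i.i.d. samples x_i ~ q_0. The snippet below is only a minimal NumPy sketch of one standard notion of empirical score, the closed-form score of the Gaussian-smoothed empirical distribution; the function name, the smoothing level sigma, and the toy data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def empirical_smoothed_score(x, samples, sigma):
    """Score grad log q_sigma(x) of the Gaussian-smoothed empirical distribution.

    Here q_sigma(x) = (1/m) * sum_i N(x; x_i, sigma^2 I) for m training samples
    x_i, so the score is a softmax-weighted average of (x_i - x) / sigma^2.
    """
    diffs = samples - x                       # (m, d) array of x_i - x
    sq_dists = np.sum(diffs ** 2, axis=1)     # ||x_i - x||^2 for each i

    # Softmax weights proportional to exp(-||x_i - x||^2 / (2 sigma^2)),
    # computed stably by subtracting the largest exponent.
    logits = -sq_dists / (2.0 * sigma ** 2)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    return (weights[:, None] * diffs).sum(axis=0) / sigma ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, d = 1000, 2                             # toy sample count and dimension
    samples = rng.normal(size=(m, d))          # stand-in for m i.i.d. draws from q_0
    print(empirical_smoothed_score(np.zeros(d), samples, sigma=0.5))
```

In the paper itself this kind of target is approximated by a trained neural network; the closed form above is shown only to make the "empirical score estimation" row concrete, not as the paper's implementation.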