Adapting to Unknown Low-Dimensional Structures in Score-Based Diffusion Models

Authors: Gen Li, Yuling Yan

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This paper investigates score-based diffusion models when the underlying target distribution is concentrated on or near low-dimensional manifolds (Introduction; theoretical focus). Section 4 is titled "Analysis for the DDPM sampler (Proof of Theorem 1)". Section 5 is a "Simulation study", but the simulation is theoretical: "We conducted a simple simulation to compare our coefficient design (2.4) with another design... We consider the degenerated Gaussian distribution p_data = N(0, I_k) in Theorem 2 as a tractable example, and run the DDPM sampler with exact score functions (so that the error only comes from discretization)." The NeurIPS checklist answers "NA" to all experiment-related questions, explicitly stating "The paper does not include experiments."
Researcher Affiliation | Academia | Gen Li, The Chinese University of Hong Kong (genli@cuhk.edu.hk); Yuling Yan, University of Wisconsin-Madison (yuling.yan@wisc.edu).
Pseudocode | No | The paper describes the forward and reverse processes with mathematical equations (e.g., (1.1), (1.2), (2.1), (2.3)) but does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | No | The NeurIPS Paper Checklist entry for "Open access to data and code" states: "The answer NA means that paper does not include experiments requiring code. ... While we encourage the release of code and data, we understand that this might not be possible, so No is an acceptable answer. Papers cannot be rejected simply for not including code, unless this is central to the contribution (e.g., for a new open-source benchmark)." The paper provides no statement or link to open-source code for its methodology.
Open Datasets | No | Image datasets are mentioned only as motivating examples, not used for experiments: "For example, for two widely used image datasets, CIFAR-10 (dimension d = 32 × 32 × 3) and ImageNet (dimension d ≈ 64 × 64 × 3), it is known that 50 and 250 steps (also known as NFE, the number of function evaluations) are sufficient to generate good samples [16, 9]." The simulation study uses a theoretical distribution rather than a publicly available dataset: "We consider the degenerated Gaussian distribution p_data = N(0, I_k) in Theorem 2 as a tractable example, and run the DDPM sampler with exact score functions (so that the error only comes from discretization)." The NeurIPS checklist also confirms "The paper does not include experiments."
Dataset Splits | No | The NeurIPS Paper Checklist states "The paper does not include experiments." The simulation study uses a theoretical distribution, not a dataset with defined train/validation/test splits.
Hardware Specification | No | The paper is theoretical, and its "Simulation study" section does not specify any hardware used for the simulation.
Software Dependencies | No | The paper is theoretical, and its "Simulation study" section does not specify any software dependencies with version numbers.
Experiment Setup | Yes | "We fix the intrinsic dimension k = 8, and let the ambient dimension d grow from 10 to 10³. We implement the experiment for four different numbers of steps T ∈ {100, 200, 500, 1000}. Instead of using the learning rate schedule (2.5), which is chosen mainly to facilitate analysis, we use the schedule in [11] that is commonly used in practice."
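For concreteness, the simulation setup described above can be sketched in a few lines. This is not the authors' code: the linear β schedule and the choice σ_t = √β_t below are assumptions standing in for the schedule of [11], and this sketch does not reproduce the paper's coefficient design (2.4). It only illustrates running the DDPM sampler with the exact score of the degenerate Gaussian p_data = N(0, diag(I_k, 0)) embedded in R^d, so that the only error is discretization error:

```python
import numpy as np

def ddpm_degenerate_gaussian(d, k=8, T=1000, seed=0):
    """Sample via DDPM using the exact score of p_data = N(0, diag(I_k, 0)).

    Sketch only: the linear beta schedule and sigma_t = sqrt(beta_t)
    are assumptions, not the paper's coefficient design.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)   # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    def score(x, t):
        # Exact score of the time-t marginal N(0, abar_t*Sigma + (1-abar_t)*I):
        # variance 1 on the k intrinsic coordinates, 1 - abar_t on the rest.
        var = np.full(d, 1.0 - alpha_bars[t])
        var[:k] = 1.0
        return -x / var

    x = rng.standard_normal(d)           # initialize from N(0, I_d)
    for t in range(T - 1, -1, -1):
        z = rng.standard_normal(d) if t > 0 else np.zeros(d)
        # DDPM reverse update with exact score; error is pure discretization
        x = (x + (1.0 - alphas[t]) * score(x, t)) / np.sqrt(alphas[t])
        x += np.sqrt(betas[t]) * z
    return x

sample = ddpm_degenerate_gaussian(d=100)
# Off-manifold coordinates (indices 8..d-1) should be driven near zero,
# while the first k = 8 coordinates remain approximately standard normal.
print(np.abs(sample[8:]).max(), sample[:8].std())
```

Measuring how the sampling error scales as d grows from 10 to 10³ (for T ∈ {100, 200, 500, 1000}) is then a matter of repeating this over d and comparing against the known target distribution.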