Leveraging Drift to Improve Sample Complexity of Variance Exploding Diffusion Models
Authors: Ruofeng Yang, Zhijie Wang, Bo Jiang, Shuai Li
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also show that the drifted VESDE can balance different error terms and improve generated samples without training through synthetic and real-world experiments. |
| Researcher Affiliation | Academia | Ruofeng Yang, John Hopcroft Center for Computer Science, Shanghai Jiao Tong University (wanshuiyin@sjtu.edu.cn); Zhijie Wang, John Hopcroft Center for Computer Science, Shanghai Jiao Tong University (violetevergarden@sjtu.edu.cn); Bo Jiang, John Hopcroft Center for Computer Science, Shanghai Jiao Tong University (bjiang@sjtu.edu.cn); Shuai Li, John Hopcroft Center for Computer Science, Shanghai Jiao Tong University (shuaili8@sjtu.edu.cn) |
| Pseudocode | No | The paper describes methods using mathematical formulations and prose, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The NeurIPS Paper Checklist states 'No' for 'Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material?' with the justification: 'As a theoretical work, we only do simple synthetic experiments to support our results. All detail and the used checkpoint are shown in Appendix G.' |
| Open Datasets | Yes | Datasets. The 1-D GMM distribution contains three modes: 3/10 N(-8, 0.01) + 3/10 N(-4, 0.01) + 4/10 N(3, 1). For multiple Swiss rolls, we use similar code to Listing 2 of Lai et al. [2023], except Line 6, which we change to `data /= 10.` to obtain a larger-variance dataset. Each dataset contains 50000 datapoints. The implementable algorithm. In this subsection, we choose two forward processes: (1) conservative β_t = 1 with τ = T; (2) pure VESDE without drift term (Equation (2)) with σ_t^2 = t. To match our analysis, we choose two sampling methods for the reverse process: the Euler-Maruyama method for the reverse SDE and the RK45 ODE solver for the reverse PFODE. For the CelebA dataset (size: 256×256×3), we use the ve/celebahq_256_ncsnpp_continuous checkpoint provided by [Song et al., 2020b]. |
| Dataset Splits | No | The paper mentions training data, but does not explicitly provide training/test/validation dataset splits or sample counts for reproducibility. |
| Hardware Specification | Yes | The above experiments are conducted on a GeForce RTX 4090. |
| Software Dependencies | No | The paper mentions software components like the 'RK45 ODE solver' and implicitly relies on standard machine learning libraries, but it does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We train for 200 epochs with batch size 200 and learning rate 10^-4. For both training and inference, the start time is δ = 10^-5. |
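The 1-D GMM dataset quoted above (3/10 N(-8, 0.01) + 3/10 N(-4, 0.01) + 4/10 N(3, 1), 50000 datapoints) is straightforward to reconstruct. The following is a minimal sketch, not the authors' released code; the helper name `sample_gmm` and the use of NumPy are assumptions, and note that the mixture parameters are variances, so the standard deviations are their square roots.

```python
import numpy as np

def sample_gmm(n, seed=None):
    """Draw n samples from the paper's 1-D GMM:
    3/10 N(-8, 0.01) + 3/10 N(-4, 0.01) + 4/10 N(3, 1).
    The second parameter of each component is a variance."""
    rng = np.random.default_rng(seed)
    means = np.array([-8.0, -4.0, 3.0])
    stds = np.sqrt(np.array([0.01, 0.01, 1.0]))  # std = sqrt(variance)
    weights = np.array([0.3, 0.3, 0.4])
    # Pick a mixture component per sample, then draw from that Gaussian.
    comps = rng.choice(3, size=n, p=weights)
    return rng.normal(means[comps], stds[comps])

data = sample_gmm(50000, seed=0)
```

The empirical mean of such a sample should be close to the mixture mean 0.3·(−8) + 0.3·(−4) + 0.4·3 = −2.4, which gives a quick sanity check on the weights.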