Score-Based Generative Modeling with Critically-Damped Langevin Diffusion
Authors: Tim Dockhorn, Arash Vahdat, Karsten Kreis
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively validate CLD and the novel SDE solver: (i) We show that the neural networks learnt in CLD-based SGMs are smoother than those of previous SGMs. (ii) On the CIFAR-10 image modeling benchmark, we demonstrate that CLD-based models outperform previous diffusion models in synthesis quality for similar network architectures and sampling compute budgets. (iii) We show that our novel sampling scheme for CLD significantly outperforms the popular Euler-Maruyama method. (iv) We perform ablations on various aspects of CLD and find that CLD does not have difficult-to-tune hyperparameters. [...] 5 Experiments |
| Researcher Affiliation | Collaboration | Tim Dockhorn (1, 2, 3), Arash Vahdat (1), Karsten Kreis (1); 1: NVIDIA, 2: University of Waterloo, 3: Vector Institute |
| Pseudocode | Yes | Algorithm 1 Symmetric Splitting CLD Sampler (SSCS) |
| Open Source Code | Yes | Project page and code: https://nv-tlabs.github.io/CLD-SGM. [...] we made source code to reproduce the main results of the paper publicly available, including detailed instructions; see our project page https://nv-tlabs.github.io/CLD-SGM and the code repository https://github.com/nv-tlabs/CLD-SGM. |
| Open Datasets | Yes | On the CIFAR-10 image modeling benchmark [...] additionally trained a CLD-SGM on CelebA-HQ-256 |
| Dataset Splits | No | The paper uses standard benchmarks like CIFAR-10 and CelebA-HQ-256, which have predefined splits, but it does not explicitly state the percentages or sample counts for training, validation, or test sets in the main text. It refers to these standard setups without providing the specific numerical split details. |
| Hardware Specification | No | Table 6 in Appendix E.2.1 states 'Batch size per GPU 8' and '# of GPUs 16', implying the use of GPUs. However, it does not specify the exact model of GPUs (e.g., NVIDIA A100) or any other detailed hardware specifications like CPU models or memory. |
| Software Dependencies | No | The paper mentions 'provided PyTorch code' in Appendix E.2.2 but does not specify a version number for PyTorch or any other software dependencies with their versions. |
| Experiment Setup | Yes | Table 6: Model architectures as well as SDE and training setups for our experiments on CIFAR-10 and CelebA-HQ-256. [...] Learning rate 2e-4, Gradient norm clipping 1.0, Dropout 0.1, Batch size per GPU 8, # of GPUs 16, M = 0.25, gamma = 0.04, beta = 4, epsilon_num = 1e-9. |
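
For readers who want to mirror the CIFAR-10 setup quoted in the Experiment Setup row, below is a minimal configuration sketch assembled only from the Table 6 values listed above. The key names are illustrative and do not necessarily match the field names used in the official nv-tlabs/CLD-SGM configuration files; the inline interpretations of M, gamma, and beta are assumptions based on the paper's notation.

```python
# Hypothetical configuration sketch built from the Table 6 values quoted above.
# Key names are illustrative, not the official nv-tlabs/CLD-SGM config schema.
cld_cifar10_config = {
    # Optimization
    "learning_rate": 2e-4,
    "grad_norm_clip": 1.0,
    "dropout": 0.1,
    # Parallelism: effective batch size = 8 * 16 = 128
    "batch_size_per_gpu": 8,
    "num_gpus": 16,
    # CLD SDE / training hyperparameters (interpretations are assumptions)
    "M": 0.25,        # mass of the auxiliary velocity variable
    "gamma": 0.04,    # initial velocity variance scale used for hybrid score matching
    "beta": 4,        # constant of the noise/time schedule
    "eps_num": 1e-9,  # small constant for numerical stability
}

if __name__ == "__main__":
    for key, value in cld_cifar10_config.items():
        print(f"{key}: {value}")
```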
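The Research Type row quotes the paper's comparison between its SSCS sampler and the Euler-Maruyama baseline. As a point of reference, here is a minimal sketch of an Euler-Maruyama sampler for a reverse-time CLD-style SDE; this is the baseline integrator, not the paper's SSCS (Algorithm 1), and the `score_model(x, v, t)` signature, shapes, step count, and prior are assumptions for illustration.

```python
import torch

def em_sample_cld(score_model, shape, n_steps=500, T=1.0,
                  beta=4.0, M=0.25, device="cpu"):
    """Euler-Maruyama sampler for a reverse-time CLD-style SDE (baseline, not SSCS).

    `score_model(x, v, t)` is assumed to return an estimate of the velocity
    score grad_v log p_t(x, v); its signature is hypothetical.
    """
    Gamma = 2.0 * M ** 0.5  # critical damping: Gamma^2 = 4 * M
    dt = T / n_steps

    # Approximate equilibrium prior at t = T: x ~ N(0, I), v ~ N(0, M * I).
    x = torch.randn(shape, device=device)
    v = torch.randn(shape, device=device) * M ** 0.5

    for i in range(n_steps):
        t = T - i * dt
        t_batch = torch.full((shape[0],), t, device=device)
        score = score_model(x, v, t_batch)  # approximates grad_v log p_t(x, v)

        # Reverse-time drift f(u, t) - G G^T score, integrated backward in time.
        dx = beta / M * v
        dv = -beta * x - beta * Gamma / M * v - 2.0 * Gamma * beta * score

        noise = torch.randn_like(v)
        x = x - dt * dx
        v = v - dt * dv + (2.0 * Gamma * beta * dt) ** 0.5 * noise

    # A final denoising step near t = 0 is omitted for brevity.
    return x
```

The paper's point (iii) is precisely that its symmetric splitting scheme (SSCS) needs far fewer steps than this kind of naive Euler-Maruyama loop for comparable sample quality.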