Consistency Models
Authors: Yang Song, Prafulla Dhariwal, Mark Chen, Ilya Sutskever
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate that they outperform existing distillation techniques for diffusion models in one- and few-step sampling, achieving the new state-of-the-art FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 for one-step generation. When trained in isolation, consistency models become a new family of generative models that can outperform existing one-step, non-adversarial generative models on standard benchmarks such as CIFAR-10, ImageNet 64x64 and LSUN 256x256. |
| Researcher Affiliation | Industry | OpenAI, San Francisco, CA 94110, USA. Correspondence to: Yang Song <songyang@openai.com>. |
| Pseudocode | Yes | Algorithm 1 Multistep Consistency Sampling, Algorithm 2 Consistency Distillation (CD), Algorithm 3 Consistency Training (CT). |
| Open Source Code | No | The paper does not include an unambiguous statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We employ consistency distillation and consistency training to learn consistency models on real image datasets, including CIFAR-10 (Krizhevsky et al., 2009), ImageNet 64x64 (Deng et al., 2009), LSUN Bedroom 256x256, and LSUN Cat 256x256 (Yu et al., 2015). |
| Dataset Splits | No | The paper references standard datasets like CIFAR-10, ImageNet, and LSUN, which have predefined splits, but does not explicitly state the training, validation, or test dataset splits (e.g., percentages or sample counts) within the paper itself. |
| Hardware Specification | Yes | We trained all models on a cluster of Nvidia A100 GPUs. |
| Software Dependencies | No | The paper mentions the Rectified Adam optimizer but does not provide version numbers for its software stack, such as the programming language (e.g., Python), deep learning framework (e.g., PyTorch, TensorFlow), or CUDA libraries. |
| Experiment Setup | Yes | Table 3: Hyperparameters used for training CD and CT models, which include learning rate, batch size, ODE solver, EMA decay rate, training iterations, mixed precision (FP16), dropout probability, and number of GPUs. |
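The Pseudocode row above names Algorithm 1 (Multistep Consistency Sampling). The sketch below is an illustrative PyTorch rendering of that procedure, not the authors' released code: `consistency_model`, the function name, and the EDM-style defaults `sigma_max=80.0` and `sigma_min=0.002` are assumptions about the setup the paper builds on.

```python
import torch

# Illustrative sketch of multistep consistency sampling (Algorithm 1 in the paper).
# `consistency_model` is assumed to be a callable f_theta(x, t) mapping a noisy
# input at noise level t to an estimate of the clean sample; names, signature,
# and the sigma_max/sigma_min defaults are assumptions, not the authors' code.
@torch.no_grad()
def multistep_consistency_sampling(consistency_model, shape, taus,
                                   sigma_max=80.0, sigma_min=0.002, device="cpu"):
    # Draw initial noise x_T ~ N(0, T^2 I) and map it to a one-step sample.
    x = sigma_max * torch.randn(shape, device=device)
    x = consistency_model(x, sigma_max)
    # Each additional step re-noises to a smaller time tau_n and denoises again.
    for tau in taus:  # taus in decreasing order, each in (sigma_min, sigma_max)
        z = torch.randn_like(x)
        x_tau = x + (tau ** 2 - sigma_min ** 2) ** 0.5 * z
        x = consistency_model(x_tau, tau)
    return x
```

With an empty `taus` list this reduces to the one-step generation the FID numbers in the Research Type row refer to; adding time points trades compute for sample quality.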
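For Algorithm 3 (Consistency Training), also listed in the Pseudocode row, the following is a minimal sketch of a single update, assuming image tensors of shape (B, C, H, W), a fixed discretized schedule `timesteps` (1-D tensor, ascending, on the same device), a fixed EMA decay `mu`, and a squared-L2 distance in place of the LPIPS metric the paper favours; the paper's adaptive schedules for N and mu are omitted.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch of one consistency training (CT) step, simplified from
# Algorithm 3: `model` (f_theta) and `ema_model` (f_{theta^-}) are assumed
# callables f(x, t); MSE stands in for LPIPS, and the schedule is fixed.
def consistency_training_step(model, ema_model, x, timesteps, optimizer, mu=0.999):
    # Sample adjacent time indices n, n+1 for each example in the batch.
    n = torch.randint(0, len(timesteps) - 1, (x.shape[0],), device=x.device)
    t_n = timesteps[n].view(-1, 1, 1, 1)
    t_np1 = timesteps[n + 1].view(-1, 1, 1, 1)
    # Perturb the same clean sample with the same Gaussian noise at both times.
    z = torch.randn_like(x)
    pred = model(x + t_np1 * z, t_np1)
    with torch.no_grad():
        target = ema_model(x + t_n * z, t_n)  # stop-gradient target network
    loss = F.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # EMA update of the target network with decay rate mu.
    with torch.no_grad():
        for p, p_ema in zip(model.parameters(), ema_model.parameters()):
            p_ema.mul_(mu).add_(p, alpha=1 - mu)
    return loss.item()
```

The quantities this sketch parameterizes (learning rate via `optimizer`, batch size via `x`, EMA decay `mu`, number of iterations) correspond to the hyperparameter categories reported in Table 3 per the Experiment Setup row; the values themselves are not reproduced here.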