Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Consistency Models
Authors: Yang Song, Prafulla Dhariwal, Mark Chen, Ilya Sutskever
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate that they outperform existing distillation techniques for diffusion models in one- and few-step sampling, achieving the new state-of-the-art FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 for one-step generation. When trained in isolation, consistency models become a new family of generative models that can outperform existing one-step, non-adversarial generative models on standard benchmarks such as CIFAR-10, ImageNet 64x64 and LSUN 256x256. |
| Researcher Affiliation | Industry | ¹OpenAI, San Francisco, CA 94110, USA. Correspondence to: Yang Song <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Multistep Consistency Sampling, Algorithm 2 Consistency Distillation (CD), Algorithm 3 Consistency Training (CT). |
| Open Source Code | No | The paper does not include an unambiguous statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We employ consistency distillation and consistency training to learn consistency models on real image datasets, including CIFAR-10 (Krizhevsky et al., 2009), ImageNet 64x64 (Deng et al., 2009), LSUN Bedroom 256x256, and LSUN Cat 256x256 (Yu et al., 2015). |
| Dataset Splits | No | The paper references standard datasets like CIFAR-10, ImageNet, and LSUN, which have predefined splits, but does not explicitly state the training, validation, or test dataset splits (e.g., percentages or sample counts) within the paper itself. |
| Hardware Specification | Yes | We trained all models on a cluster of Nvidia A100 GPUs. |
| Software Dependencies | No | The paper mentions 'Rectified Adam optimizer' but does not provide specific version numbers for software components like programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or CUDA libraries. |
| Experiment Setup | Yes | Table 3: Hyperparameters used for training CD and CT models, which includes Learning rate, Batch size, ODE solver, EMA decay rate, Training iterations, Mixed-Precision (FP16), Dropout probability, and Number of GPUs. |