Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation
Authors: Hongjian Liu, Qingsong Xie, Tianxiang Ye, Zhijie Deng, Chen Chen, Shixiang Tang, Xueyang Fu, Haonan Lu, Zheng-Jun Zha
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, on the MSCOCO-2017 5K dataset with a Stable Diffusion-V1.5 teacher, SCott achieves an FID of 21.9 with 2 sampling steps, surpassing that of the 1-step InstaFlow (23.4) and the 4-step UFOGen (22.1). Moreover, SCott can yield more diverse samples than other consistency models for high-resolution image generation, with up to 16% improvement in a qualified metric. |
| Researcher Affiliation | Collaboration | Hongjian Liu1*, Qingsong Xie2*, Tianxiang Ye3, Zhijie Deng3, Chen Chen2, Shixiang Tang4, Xueyang Fu1, Haonan Lu2, Zheng-Jun Zha1. Affiliations: 1 University of Science and Technology of China, China; 2 OPPO AI Center; 3 Shanghai Jiao Tong University, China; 4 The Chinese University of Hong Kong |
| Pseudocode | No | The paper describes the methodology in prose and through mathematical equations and diagrams (Figure 2), but does not include any explicitly labeled pseudocode or algorithm blocks in the main text. |
| Open Source Code | No | The paper does not provide an explicit statement about the release of their source code or a link to a code repository. |
| Open Datasets | Yes | Empirically, on the MSCOCO-2017 5K dataset with a Stable Diffusion-V1.5 teacher, SCott achieves an FID of 21.9 with 2 sampling steps... We use LAION-Aesthetics-6+ dataset (Schuhmann et al. 2022). |
| Dataset Splits | Yes | Empirically, on the MSCOCO-2017 5K dataset with a Stable Diffusion-V1.5 teacher, SCott achieves an FID of 21.9 with 2 sampling steps... On MSCOCO2017 5K validation dataset with a Stable Diffusion-V1.5 (SD1.5) (Rombach et al. 2022) teacher, our 2-step method achieves an FID (Heusel et al. 2017) of 21.9... Comparison on MSCOCO-2014 30K... Comparison on MJHQ-5K validation dataset... |
| Hardware Specification | Yes | We train SCott with 4 A100 GPUs and a batch size of 40 for 40K iterations. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers, such as Python or PyTorch versions, that would be needed to replicate the experiment. |
| Experiment Setup | Yes | We train SCott with 4 A100 GPUs and a batch size of 40 for 40K iterations. The learning rate is 8e-6 for SCott and 2e-5 for the discriminator. In practice, we set λ_adv = 0.4 to control the strength of the discriminator for refining the outputs of f_θ. Empirically, we set t_m = t_n 24 and h = 3. |
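
For anyone attempting a replication, the hyperparameters quoted in the Hardware Specification and Experiment Setup rows can be gathered into one configuration object. The following is a minimal sketch only: the class name `SCottTrainConfig`, its field names, and the `samples_seen` helper are hypothetical and do not come from the paper or from any released code. It also assumes the batch size of 40 is the global batch size, which the quoted text does not state. The value of t_m is omitted because it is not legible in the quoted evidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SCottTrainConfig:
    """Hypothetical container for the settings quoted in the table above."""

    num_gpus: int = 4                 # "4 A100 GPUs"
    gpu_type: str = "A100"
    batch_size: int = 40              # assumed to be the global batch size
    iterations: int = 40_000          # "40K iterations"
    lr_student: float = 8e-6          # learning rate for SCott
    lr_discriminator: float = 2e-5    # learning rate for the discriminator
    lambda_adv: float = 0.4           # weight of the adversarial term
    h: int = 3                        # "h = 3" as quoted

    def samples_seen(self) -> int:
        """Total training samples, under the global-batch-size assumption."""
        return self.batch_size * self.iterations


cfg = SCottTrainConfig()
print(cfg.samples_seen())
```

Freezing the dataclass keeps the recorded settings from being changed by accident during a run. If the batch size turns out to be per GPU, the sample count would need to be multiplied by `num_gpus`.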