Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
Authors: Zigeng Chen, Xinyin Ma, Gongfan Fang, Zhenxiong Tan, Xinchao Wang
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validated the broad applicability of AsyncDiff through extensive testing on several diffusion models. For text-to-image tasks, we experimented with three versions of Stable Diffusion: SD 1.5, SD 2.1 [43], and Stable Diffusion XL (SDXL) [41]. Additionally, we explored the effectiveness of AsyncDiff on video diffusion models using Stable Video Diffusion (SVD) [2] and AnimateDiff [9]. |
| Researcher Affiliation | Academia | Zigeng Chen, Xinyin Ma, Gongfan Fang, Zhenxiong Tan, Xinchao Wang; National University of Singapore |
| Pseudocode | No | The paper contains diagrams illustrating the asynchronous denoising process but no formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/czg1225/AsyncDiff |
| Open Datasets | Yes | We assess the zero-shot generation capability using the MS-COCO 2017 [29] validation set, which comprises 5,000 images and captions. |
| Dataset Splits | Yes | We assess the zero-shot generation capability using the MS-COCO 2017 [29] validation set, which comprises 5,000 images and captions. |
| Hardware Specification | Yes | All latency measurements were conducted on NVIDIA A5000 GPUs equipped with NVLINK Bridge. We tested inference speeds on the professional-grade NVIDIA RTX A5000, as well as the consumer-grade NVIDIA RTX 2080 Ti and NVIDIA RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions using 'torch.distributed' and 'NVIDIA Collective Communication Library (NCCL) backend' but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | All models were evaluated using 50 DDIM steps. In this context, N represents the number of segments into which the denoising model is divided, and S denotes the stride of denoising for each parallel computation batch. We also explore the effect of warm-up steps. |
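
The Software Dependencies row is marked "No" because no version numbers are reported. As a minimal sketch of how such versions could be recorded alongside an experiment, assuming a standard Python environment, the snippet below queries installed package metadata. The helper name `collect_versions` and the choice of `torch` as the package to inspect are illustrative assumptions, not taken from the paper or its repository.

```python
from importlib import metadata


def collect_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # "torch" is an assumed example; the paper names torch.distributed
    # but does not list exact packages or versions.
    for name, version in collect_versions(["torch"]).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Writing this output to a file next to the experiment results is one simple way to make the software environment recoverable later.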