AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Authors: Zigeng Chen, Xinyin Ma, Gongfan Fang, Zhenxiong Tan, Xinchao Wang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validated the broad applicability of AsyncDiff through extensive testing on several diffusion models. For text-to-image tasks, we experimented with three versions of Stable Diffusion: SD 1.5, SD 2.1 [43], and Stable Diffusion XL (SDXL) [41]. Additionally, we explored the effectiveness of AsyncDiff on video diffusion models using Stable Video Diffusion (SVD) [2] and AnimateDiff [9].
Researcher Affiliation | Academia | Zigeng Chen, Xinyin Ma, Gongfan Fang, Zhenxiong Tan, Xinchao Wang, National University of Singapore. zigeng99@u.nus.edu, xinchao@nus.edu.sg
Pseudocode | No | The paper contains diagrams illustrating the asynchronous denoising process but no formal pseudocode or algorithm blocks.
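Since the paper provides no algorithm block, the following is a loose conceptual sketch (not the authors' code) of the asynchronous denoising idea: the denoising network is split into N sequential segments, and after a sequential warm-up pass, each segment consumes its predecessor's output cached from the previous timestep, breaking the intra-step dependency chain so the segments can run on separate GPUs. The function names, the scheduler interface, and the omission of the stride S are simplifying assumptions.

```python
# Conceptual sketch only -- not the authors' implementation.
def async_denoise(x_T, segments, scheduler, num_steps, warmup=1):
    x = x_T
    cache = [None] * len(segments)   # per-segment outputs from the prior step
    for i, t in enumerate(reversed(range(num_steps))):
        if i < warmup:
            # Warm-up: ordinary fully sequential pass through all segments.
            h = x
            for k, seg in enumerate(segments):
                h = seg(h, t)
                cache[k] = h
            eps = h
        else:
            # Asynchronous step: segment k reads the stale output of
            # segment k-1 from the previous timestep, so this loop can
            # execute concurrently, one segment per device.
            cache = [seg(x if k == 0 else cache[k - 1], t)
                     for k, seg in enumerate(segments)]
            eps = cache[-1]
        x = scheduler.step(eps, t, x)  # standard DDIM-style update
    return x
```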
Open Source Code | Yes | Code is available at https://github.com/czg1225/AsyncDiff
Open Datasets | Yes | We assess the zero-shot generation capability using the MS-COCO 2017 [29] validation set, which comprises 5,000 images and captions.
Dataset Splits | Yes | We assess the zero-shot generation capability using the MS-COCO 2017 [29] validation set, which comprises 5,000 images and captions.
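For context, the reported split is typically loaded with the standard pycocotools API; the annotation path below follows the usual COCO download layout and is an assumption, not something the paper specifies.

```python
from pycocotools.coco import COCO

# Load caption annotations for the MS-COCO 2017 validation split
# (5,000 images); the path assumes the standard download layout.
coco = COCO("annotations/captions_val2017.json")
img_ids = coco.getImgIds()
ann_ids = coco.getAnnIds(imgIds=img_ids[0])
captions = [ann["caption"] for ann in coco.loadAnns(ann_ids)]
print(len(img_ids), "images; first caption:", captions[0])
```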
Hardware Specification | Yes | All latency measurements were conducted on NVIDIA A5000 GPUs equipped with an NVLink Bridge. We tested inference speeds on the professional-grade NVIDIA RTX A5000, as well as the consumer-grade NVIDIA RTX 2080 Ti and NVIDIA RTX 3090 GPUs.
Software Dependencies | No | The paper mentions using 'torch.distributed' and the 'NVIDIA Collective Communication Library (NCCL) backend' but does not specify version numbers for these or any other software dependencies.
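While versions go unspecified, the distributed setup the paper names is conventionally initialized as below; this is a minimal sketch, and the launcher command in the comment is an assumption.

```python
import torch
import torch.distributed as dist

# Initialize the default process group with the NCCL backend, the
# communication library the paper reports using via torch.distributed.
# Typically launched with, e.g.: torchrun --nproc_per_node=4 script.py
dist.init_process_group(backend="nccl")
rank = dist.get_rank()
torch.cuda.set_device(rank)  # one GPU per process / model segment
print(f"rank {rank} of {dist.get_world_size()} initialized")
```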
Experiment Setup | Yes | All models were evaluated using 50 DDIM steps. In this context, N represents the number of segments into which the denoising model is divided, and S denotes the stride of denoising for each parallel computation batch. We also explore the effect of warm-up steps.
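A minimal sketch of the reported evaluation setting using the diffusers library: the model ID and prompt are illustrative assumptions, and the AsyncDiff-specific N and S parameters would wrap a pipeline like this rather than appear in it. Only the 50 DDIM steps come from the paper.

```python
import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler

# Build an SD 2.1 pipeline with a DDIM scheduler, matching the paper's
# 50-step DDIM evaluation. N (model segments) and S (denoising stride)
# are AsyncDiff-specific settings applied on top of such a pipeline.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

image = pipe("a photo of a corgi riding a skateboard",
             num_inference_steps=50).images[0]
image.save("sample.png")
```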