PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator
Authors: Hanshu Yan, Xingchao Liu, Jiachun Pan, Jun Hao Liew, Qiang Liu, Jiashi Feng
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive experiments to verify the effectiveness of PeRFlow on accelerating pretrained diffusion models, including Stable Diffusion (SD) 1.5, SD 2.1, SDXL [32], and AnimateDiff [6]. PeRFlow shows advantages in terms of FID values, visual quality, and generation diversity. |
| Researcher Affiliation | Collaboration | Hanshu Yan*, Xingchao Liu+, Jiachun Pan#, Jun Hao Liew*, Qiang Liu+, Jiashi Feng*; *ByteDance, +University of Texas at Austin, #National University of Singapore |
| Pseudocode | Yes | Algorithm 1: Piecewise Rectified Flow |
| Open Source Code | Yes | Codes for training and inference have been publicly released. https://github.com/magic-research/piecewise-rectified-flow |
| Open Datasets | Yes | Images are all sampled from the LAION-Aesthetics-5+ dataset [37] and center-cropped. We compute the FID values of PeRFlow-accelerated SDs in Table 1 using images on three different reference distributions: (1) LAION-5B-Aesthetics [37], which is the training set of PeRFlow and other methods; (2) MS COCO 2014 [17] validation dataset; (3) images generated from SDv1.5/XL with JourneyDB [41] prompts. |
| Dataset Splits | No | The paper mentions using "MS COCO 2014 [17] validation dataset" as a reference distribution for FID calculation, but it does not specify a training/validation/test split for its main training datasets with percentages or sample counts. |
| Hardware Specification | Yes | All experiments are conducted with 16 NVIDIA A100 GPUs. |
| Software Dependencies | No | The paper mentions "Hugging Face scripts for training Stable Diffusion 2" but does not provide specific version numbers for software dependencies (e.g., Python, PyTorch, Diffusers library). |
| Experiment Setup | Yes | PeRFlow-SD-v1.5 is trained with images in resolution of 512×512 using ϵ-prediction defined in (7). We randomly drop out the text captions with a low probability (10%) to enable classifier-free guidance during sampling. We divide the time range [0, 1] into four windows uniformly. For each window, we use the DDIM solver to solve the endpoints with 8 steps. |
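
The Experiment Setup row describes three concrete choices: four uniform time windows over [0, 1], 8 solver steps per window, and caption dropout with probability 10%. The sketch below illustrates only that bookkeeping in plain Python; it is not taken from the released code. The function names (`make_windows`, `solver_grid`, `maybe_drop_caption`), the uniform spacing of solver steps inside a window, the traversal direction, and the use of an empty string as the unconditional caption are all assumptions made for illustration.

```python
import random


def make_windows(num_windows=4, t_start=0.0, t_end=1.0):
    """Split [t_start, t_end] into num_windows equal-width (lo, hi) windows."""
    width = (t_end - t_start) / num_windows
    edges = [t_start + k * width for k in range(num_windows)] + [t_end]
    return list(zip(edges[:-1], edges[1:]))


def solver_grid(lo, hi, num_steps=8):
    """Time points for num_steps equal solver steps across one window.

    Points are returned from hi down to lo; both the direction and the
    equal spacing are assumptions of this sketch.
    """
    step = (hi - lo) / num_steps
    return [hi - i * step for i in range(num_steps)] + [lo]


def maybe_drop_caption(caption, rng, drop_prob=0.1):
    """Return "" with probability drop_prob, otherwise the caption unchanged.

    Using "" as the unconditional prompt is an assumption of this sketch.
    """
    return "" if rng.random() < drop_prob else caption


windows = make_windows()          # four windows over [0, 1]
grid = solver_grid(*windows[-1])  # 8 steps (9 time points) in the last window
```

In a real training loop, each entry of `grid` would presumably be passed to a DDIM-style solver step of the pretrained model to obtain the window endpoint, and `maybe_drop_caption` would be applied per sample before text encoding.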