Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference

Authors: Zihao Yu, Haoyang Li, Fangcheng Fu, Xupeng Miao, Bin Cui

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive empirical results show that FISEdit can be 3.4× and 4.4× faster than existing methods on NVIDIA TITAN RTX and A100 GPUs respectively, and even generates more satisfactory images."
Researcher Affiliation | Academia | Zihao Yu (1), Haoyang Li (1), Fangcheng Fu (1), Xupeng Miao (2), Bin Cui (1,3). (1) School of Computer Science & Key Lab of High Confidence Software Technologies (MOE), Peking University; (2) Carnegie Mellon University; (3) Institute of Computational Social Science, Peking University (Qingdao), China
Pseudocode | No | The paper describes the methods in text and figures, but presents no structured pseudocode or algorithm blocks.
Open Source Code | Yes | "We implement our system based on Hugging Face's diffusers, which is a generic framework for training and inference of diffusion models. We clone this project and integrate it with our self-developed sparse inference engine Hetu (Miao et al. 2022c,a,b)... and more details about our evaluation configurations can be found in our repository." Repository: https://github.com/Hankpipi/diffusers-hetu
Open Datasets | Yes | "We select LAION-Aesthetics (Schuhmann et al. 2022) as the evaluation dataset... The processed dataset consists of 454,445 examples..." Dataset: http://instruct-pix2pix.eecs.berkeley.edu/
Dataset Splits | No | The paper states the LAION-Aesthetics dataset and its total size, but does not specify explicit percentages or counts for training, validation, or test splits.
Hardware Specification | Yes | "Eventually, we accelerate text-to-image inference by up to 4.4× on NVIDIA TITAN RTX and 3.4× on NVIDIA A100 when the edit size is 5%."
Software Dependencies | No | The paper names Hugging Face's diffusers and Hetu as software components but does not provide specific version numbers for them.
Experiment Setup | Yes | "We vary InstructPix2Pix's image guidance scale between [1.0, 2.5], SDEdit's strength between [0.5, 0.75], DiffEdit's strength between [0.5, 1.0], and edited size between [0.25, 0.75] for SDIP and our method."