Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference
Authors: Zihao Yu, Haoyang Li, Fangcheng Fu, Xupeng Miao, Bin Cui
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | extensive empirical results show that FISEdit can be 3.4× and 4.4× faster than existing methods on NVIDIA TITAN RTX and A100 GPUs respectively, and even generates more satisfactory images. |
| Researcher Affiliation | Academia | Zihao Yu1, Haoyang Li1, Fangcheng Fu1, Xupeng Miao2, Bin Cui1,3 1 School of Computer Science & Key Lab of High Confidence Software Technologies (MOE), Peking University 2 Carnegie Mellon University 3 Institute of Computational Social Science, Peking University (Qingdao), China |
| Pseudocode | No | The paper describes the methods in text and figures, but no structured pseudocode or algorithm blocks are explicitly presented. |
| Open Source Code | Yes | We implement our system based on the Hugging Face's diffusers[1], which is a generic framework for training and inference of diffusion models. We clone this project and integrate it with our self-developed sparse inference engine Hetu[2] (Miao et al. 2022c,a,b)... and more details about our evaluation configurations can be found in our repository[4]. [4] https://github.com/Hankpipi/diffusers-hetu |
| Open Datasets | Yes | We select LAION-Aesthetics (Schuhmann et al. 2022) as the evaluation dataset... The processed dataset[5] consists of 454,445 examples... [5] http://instruct-pix2pix.eecs.berkeley.edu/ |
| Dataset Splits | No | The paper mentions using the LAION-Aesthetics dataset and its total size, but does not specify explicit percentages or counts for training, validation, or test splits. |
| Hardware Specification | Yes | Eventually, we accelerate text-to-image inference by up to 4.4× on NVIDIA TITAN RTX and 3.4× on NVIDIA A100 when the edit size is 5%. |
| Software Dependencies | No | The paper mentions "Hugging Face's diffusers" and "Hetu" as software components but does not provide specific version numbers for them. |
| Experiment Setup | Yes | We vary InstructPix2Pix's image guidance scale between [1.0, 2.5], SDEdit's strength between [0.5, 0.75], DIFFEdit's strength between [0.5, 1.0], and edited size between [0.25, 0.75] for SDIP and our method. |
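
The parameter ranges quoted in the Experiment Setup row can be written down as a small sweep configuration. The following is only a minimal sketch: the range endpoints come from the quoted text, but the dictionary keys, the helper names `linspace` and `build_sweep`, and the number of points per range are assumptions made for illustration. The quoted text does not state how many values were sampled within each range, and this is not the authors' evaluation code.

```python
from typing import Dict, List, Tuple

# Inclusive (low, high) ranges taken from the quoted Experiment Setup text.
# The key names are hypothetical labels chosen for this sketch.
SWEEP_RANGES: Dict[str, Tuple[float, float]] = {
    "instructpix2pix_image_guidance_scale": (1.0, 2.5),
    "sdedit_strength": (0.5, 0.75),
    "diffedit_strength": (0.5, 1.0),
    "edited_size": (0.25, 0.75),  # applies to SDIP and the proposed method
}


def linspace(low: float, high: float, num_points: int) -> List[float]:
    """Return num_points evenly spaced values from low to high, inclusive."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    step = (high - low) / (num_points - 1)
    return [round(low + i * step, 6) for i in range(num_points)]


def build_sweep(num_points: int = 3) -> Dict[str, List[float]]:
    """Expand every range into a list of values.

    num_points is an assumption: the source gives only the endpoints.
    """
    return {
        name: linspace(low, high, num_points)
        for name, (low, high) in SWEEP_RANGES.items()
    }


if __name__ == "__main__":
    for name, values in build_sweep().items():
        print(f"{name}: {values}")
```

Keeping the ranges in one mapping makes it easy to change the assumed sampling density without touching the endpoints reported in the paper.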