ParallelEdits: Efficient Multi-Aspect Text-Driven Image Editing with Attention Grouping

Authors: Mingzhen Huang, Jialing Cai, Shan Jia, Vishnu Lokhande, Siwei Lyu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 experiments. "Table 1: Comparison results in multi-aspect image editing on the PIE-Bench++ dataset."
Researcher Affiliation | Academia | "Mingzhen Huang, Jialing Cai, Shan Jia, Vishnu Suresh Lokhande, Siwei Lyu, University at Buffalo, State University of New York, USA"
Pseudocode | Yes | "A Parallel Edits: The Algorithm. In this section we provide Algorithm 1: Early Aspect Grouping and Algorithm 2: Parallel Edits on a particular branch."
Open Source Code | Yes | "Codes are available at: https://mingzhenhuang.github.io/projects/ParallelEdits.html. The code and data will be open-sourced for academic use."
Open Datasets | Yes | "Additionally, we introduce the PIE-Bench++ dataset, an expansion of the original PIE-Bench dataset, to better support evaluating image-editing tasks involving multiple objects and attributes simultaneously. Codes are available at: https://mingzhenhuang.github.io/projects/ParallelEdits.html. The code and data will be open-sourced for academic use."
Dataset Splits | No | The paper uses the PIE-Bench++ dataset for evaluation but does not specify train/validation/test splits (e.g., percentages or counts).
Hardware Specification | No | The paper does not provide hardware details such as GPU models, CPU types, or memory used for the experiments.
Software Dependencies | Yes | "Our proposed Parallel Edits is based on the Latent Consistency Model [32]," using the publicly available LCM fine-tuned from Stable Diffusion v1.5.
Experiment Setup | Yes | "During sampling, we perform LCM sampling [32] with 15 denoising steps, and the classifier-free guidance (CFG) is set to 4.0." Parallel Edits can control the editing strength by adjusting the CFG. There is a trade-off between achieving satisfactory inversion and robust editing ability: a higher CFG tends to produce stronger editing effects but may degrade inversion quality and identity preservation. The hyper-parameters are set to θ = 0.9 and β = 0.8, where θ and β are used to determine the edit type of a given edit action.
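The quoted setup can be collected into a short sketch. The hyper-parameter values (15 LCM steps, CFG 4.0, θ = 0.9, β = 0.8) come from the paper; the `classify_edit_action` helper, its two similarity inputs, and the three edit-type labels are hypothetical illustrations of how θ/β thresholds might route an edit action, not the authors' actual rule.

```python
# Hyper-parameters quoted in the paper's experiment setup.
SAMPLING_CONFIG = {
    "num_inference_steps": 15,  # LCM denoising steps
    "guidance_scale": 4.0,      # classifier-free guidance (CFG)
}
THETA = 0.9  # θ threshold used when determining the edit type
BETA = 0.8   # β threshold used when determining the edit type


def classify_edit_action(sim_strict: float, sim_loose: float,
                         theta: float = THETA, beta: float = BETA) -> str:
    """Hypothetical threshold-based edit-type routing.

    The paper only states that θ and β "determine the edit type of a
    given edit action"; the two similarity scores and the labels below
    are assumptions made for illustration.
    """
    if sim_strict >= theta:
        return "attribute-change"    # very similar prompts: local attribute edit
    if sim_loose >= beta:
        return "object-replacement"  # moderate overlap: swap an object
    return "structural-edit"         # low overlap: larger structural change
```

A higher `guidance_scale` in `SAMPLING_CONFIG` corresponds to the stronger-editing / weaker-inversion trade-off described above.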