Object-Aware Inversion and Reassembly for Image Editing

Authors: Zhen Yang, Ganggui Ding, Wen Wang, Hao Chen, Bohan Zhuang, Chunhua Shen

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that our method achieves superior performance in editing object shapes, colors, materials, categories, etc., especially in multi-object editing scenarios.
Researcher Affiliation | Academia | 1 Zhejiang University, China {zheny.cs,dingangui,wwenxyz,haochen.cad,chunhuashen}@zju.edu.cn; 2 Monash University, Australia bohan.zhuang@monash.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The project page can be found here.
Open Datasets | No | To systematically evaluate the proposed method, we collect two new datasets containing 208 and 100 single- and multi-object text-image pairs, respectively. Both quantitative and qualitative experiments demonstrate that our method achieves competitive performance in single-object editing, and outperforms state-of-the-art (SOTA) methods by a large margin in multi-object editing scenarios.
Dataset Splits | No | No specific training, validation, or test dataset splits (percentages, counts, or predefined splits) were explicitly provided in the paper.
Hardware Specification | Yes | All our experiments are conducted on the GeForce RTX 3090.
Software Dependencies | No | We use the Diffusers implementation of Stable Diffusion v1.4 in our experiments. ... We employ the CLIP base model ... and use Grounded-SAM to generate masks. (A dependency-loading sketch is given below the table.)
Experiment Setup | Yes | For DDIM Inversion, we used a uniform setting of 50 steps. ... The random seed is set to 1 for all experiments. ... In our experiments, the re-inversion step i_re is also set to 20% of the total inversion steps, as we empirically found that it performs well for most situations. (A settings sketch is given below the table.)
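As a rough reproduction aid, here is a minimal sketch of loading the dependencies named in the Software Dependencies row (the Diffusers implementation of Stable Diffusion v1.4 and a CLIP base model). The Hugging Face model identifiers are assumptions, since the paper only names the components; Grounded-SAM mask generation is omitted because its interface is not described in the text.

```python
# Minimal dependency-loading sketch (not the authors' released code).
# The model identifiers below are assumptions; the paper only names
# "Diffusers", "Stable Diffusion v1.4", "the CLIP base model", and "Grounded-SAM".
import torch
from diffusers import StableDiffusionPipeline
from transformers import CLIPModel, CLIPProcessor

device = "cuda"  # the paper reports a single GeForce RTX 3090

# Diffusers implementation of Stable Diffusion v1.4.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to(device)

# CLIP base model (the exact base checkpoint is not stated in the paper).
clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Object masks come from Grounded-SAM in the paper; its setup is omitted here.
```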
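Continuing from the sketch above, the reported experiment settings could be wired up as follows. Only the 50 DDIM inversion steps, the random seed of 1, and the re-inversion step at 20% of the total inversion steps come from the paper; using DDIMInverseScheduler to realize DDIM inversion is an assumption about the Diffusers API, not the authors' implementation.

```python
# Sketch of the reported settings, continuing from `pipe` above.
# The inverse-scheduler choice is an assumption, not the authors' code.
from diffusers import DDIMScheduler, DDIMInverseScheduler

NUM_INVERSION_STEPS = 50                             # "a uniform setting of 50 steps"
SEED = 1                                             # "random seed is set to 1 for all experiments"
RE_INVERSION_STEP = int(0.2 * NUM_INVERSION_STEPS)   # i_re = 20% of the total inversion steps

# Share one scheduler config between denoising (DDIM) and inversion (inverse DDIM).
ddim = DDIMScheduler.from_config(pipe.scheduler.config)
ddim_inverse = DDIMInverseScheduler.from_config(pipe.scheduler.config)
ddim.set_timesteps(NUM_INVERSION_STEPS)
ddim_inverse.set_timesteps(NUM_INVERSION_STEPS)

generator = torch.Generator(device).manual_seed(SEED)
```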