Tuning-Free Inversion-Enhanced Control for Consistent Image Editing
Authors: Xiaoyue Duan, Shuhao Cui, Guoliang Kang, Baochang Zhang, Zhengcong Fei, Mingyuan Fan, Junshi Huang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that the proposed method outperforms previous works in reconstruction and consistent editing, and produces impressive results in various settings. We first quantitatively evaluate the reconstruction quality of different inversion-based methods on 200 randomly selected images from the MS-COCO validation set. We measure the reconstruction quality by Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM), and efficiency by reconstruction time (Time). As provided in Table 1, the reconstruction quality of our method is significantly superior to DDIM reconstruction, attaining a level of reconstruction that is comparable to VAE, which serves as an upper bound for reconstruction. [A minimal PSNR/SSIM evaluation sketch follows the table.] |
| Researcher Affiliation | Collaboration | (1) School of Automation Science and Electrical Engineering, Beihang University, China; (2) Meituan; (3) Hangzhou Research Institute, Beihang University, China; (4) Zhongguancun Laboratory, Beijing, China; (5) Nanchang Institute of Technology, Nanchang, China |
| Pseudocode | Yes | Algorithm 1: Tuning-free Inversion-enhanced Control (TIC) for Consistent Image Editing |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | For the dataset, we evaluate the reconstruction quality of VAE, DDIM, NTI, PTI and our method on 200 randomly selected images from the MS-COCO 2017 validation set (Lin et al. 2014). |
| Dataset Splits | No | The paper mentions using 200 randomly selected images from the MS-COCO 2017 validation set for evaluation, but does not provide specific details on training, validation, or test splits for any model training or fine-tuning conducted by the authors. |
| Hardware Specification | No | The paper does not provide specific hardware details (such as GPU/CPU models or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'Stable Diffusion v1.4' and classifier-free guidance settings, but does not provide specific versions of ancillary software libraries or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA versions) needed for replication. |
| Experiment Setup | Yes | For the DDIM schedule, we perform both inversion and sampling for 50 steps, and retain the original hyperparameter choices of Stable Diffusion. The classifier-free guidance (CFG) scale is set to 7.5 for editing. The step and layer at which TIC starts are set to t0 = 4 and l0 = 10, respectively. [A minimal configuration sketch follows the table.] |
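
The reconstruction evaluation quoted above (PSNR and SSIM over 200 randomly selected MS-COCO 2017 validation images) is described in the paper but not released as code. Below is a minimal sketch of how such an evaluation could be scripted, assuming scikit-image for the metrics and a one-to-one file layout between originals and reconstructions; the paths, the 512x512 resize, and the matching logic are assumptions, not the authors' implementation.

```python
# Hedged sketch of the quoted evaluation protocol: average PSNR/SSIM over a
# random subset of 200 MS-COCO 2017 validation images. scikit-image metrics,
# the directory layout, and the 512x512 resize are assumptions, not the
# authors' implementation.
import random
from pathlib import Path

import numpy as np
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity


def load_rgb(path, size=(512, 512)):
    """Load an image as a float32 RGB array in [0, 1], resized to a common size."""
    img = Image.open(path).convert("RGB").resize(size, Image.BICUBIC)
    return np.asarray(img, dtype=np.float32) / 255.0


def evaluate_reconstruction(original_dir, reconstructed_dir, num_images=200, seed=0):
    """Average PSNR/SSIM over a random subset of matching filenames."""
    originals = sorted(Path(original_dir).glob("*.jpg"))
    random.Random(seed).shuffle(originals)
    psnr_scores, ssim_scores = [], []
    for orig_path in originals[:num_images]:
        recon_path = Path(reconstructed_dir) / orig_path.name
        if not recon_path.exists():
            continue
        orig, recon = load_rgb(orig_path), load_rgb(recon_path)
        psnr_scores.append(peak_signal_noise_ratio(orig, recon, data_range=1.0))
        ssim_scores.append(
            structural_similarity(orig, recon, channel_axis=-1, data_range=1.0)
        )
    return float(np.mean(psnr_scores)), float(np.mean(ssim_scores))


if __name__ == "__main__":
    psnr, ssim = evaluate_reconstruction("coco2017_val", "tic_reconstructions")
    print(f"PSNR: {psnr:.2f} dB  SSIM: {ssim:.4f}")
```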
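
The Experiment Setup row pins down Stable Diffusion v1.4, a 50-step DDIM schedule, and a CFG scale of 7.5, but the paper names no framework and releases no code. The following sketch shows how those settings could be reproduced with Hugging Face diffusers; the library choice, the CompVis/stable-diffusion-v1-4 checkpoint name, and the placement of the inversion step are assumptions, and the TIC-specific controls (t0 = 4, l0 = 10) would require the authors' unreleased attention controller.

```python
# Hedged sketch of the quoted sampling settings (Stable Diffusion v1.4,
# 50 DDIM steps, CFG scale 7.5) using Hugging Face diffusers. The TIC
# attention controller (t0 = 4, l0 = 10) is not public and is not
# reproduced here; this only mirrors the stated schedule and guidance.
import torch
from diffusers import DDIMInverseScheduler, DDIMScheduler, StableDiffusionPipeline

MODEL_ID = "CompVis/stable-diffusion-v1-4"  # assumed checkpoint identifier
NUM_STEPS = 50                              # DDIM inversion and sampling steps
CFG_SCALE = 7.5                             # classifier-free guidance for editing

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=dtype).to(device)

# Keep Stable Diffusion's original hyperparameters; only swap in DDIM schedulers.
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
inverse_scheduler = DDIMInverseScheduler.from_config(pipe.scheduler.config)
# `inverse_scheduler` marks where DDIM inversion of the source image would run
# before the TIC-controlled sampling pass in the authors' pipeline.

image = pipe(
    "a photo of a cat wearing a red scarf",  # illustrative edit prompt
    num_inference_steps=NUM_STEPS,
    guidance_scale=CFG_SCALE,
).images[0]
image.save("edited.png")
```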