Noise Map Guidance: Inversion with Spatial Context for Real Image Editing

Authors: Hansam Cho, Jonghyun Lee, Seoung Bum Kim, Tae-Hyun Oh, Yonghyun Jeong

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical investigations highlight NMG's adaptability across various editing techniques and its robustness to variants of DDIM inversion. Compared to other inversion methods, NMG (a) demonstrates high-fidelity editing when paired with Prompt-to-Prompt, (b) successfully conducts viewpoint alteration via MasaCtrl, and (c) preserves the spatial context of the input image while performing zero-shot image-to-image translation with pix2pix-zero.
Researcher Affiliation | Collaboration | Hansam Cho (1,2), Jonghyun Lee (1,2), Seoung Bum Kim (1), Tae-Hyun Oh (3,4), Yonghyun Jeong (2). 1: School of Industrial and Management Engineering, Korea University; 2: NAVER Cloud; 3: Dept. of Electrical Engineering and Grad. School of Artificial Intelligence, POSTECH; 4: Institute for Convergence Research and Education in Advanced Technology, Yonsei University. Contact: {chosam95, tomtom1103, sbkim1}@korea.ac.kr, taehyun@postech.ac.kr, yonghyun.jeong@navercorp.com
Pseudocode | No | The paper describes the Noise Map Guidance (NMG) process using mathematical equations and descriptive text, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The source code can be found at https://github.com/hansam95/NMG.
Open Datasets | Yes | Our evaluation utilizes the MS-COCO validation set (Lin et al., 2014), from which we randomly select 100 images.
Dataset Splits | Yes | Our evaluation utilizes the MS-COCO validation set (Lin et al., 2014), from which we randomly select 100 images. We edit 20 images for each task and report the averaged scores in Table 1.
Hardware Specification | No | Training and experiments were done on the Naver Smart Machine Learning (NSML) platform (Kim et al., 2018). This mentions a platform but does not provide specific hardware details such as GPU models, CPU types, or memory amounts.
Software Dependencies | No | Within our experimental framework, we employ Stable Diffusion (Rombach et al., 2022), standardizing the diffusion steps to T = 50 across all experiments. While Stable Diffusion is mentioned, specific version numbers for this or any other software libraries (e.g., PyTorch, TensorFlow, Python version) are not provided.
Experiment Setup | Yes | For the editing tasks, the parameters are set as follows: noise map guidance s_N = 10, text guidance s_T = 10, and guidance scale s_g = 5000. For the reconstruction tasks, the configurations are set to s_N = 10, s_T = 7.5, and s_g = 10000.
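
For readers who want to reproduce the reported setup, the following is a minimal sketch that simply collects the hyperparameters stated above (editing: s_N = 10, s_T = 10, s_g = 5000; reconstruction: s_N = 10, s_T = 7.5, s_g = 10000; T = 50 diffusion steps with Stable Diffusion). The `run_nmg_edit` entry point and all argument names are hypothetical placeholders, not the authors' API; the actual interface should be taken from the released repository at https://github.com/hansam95/NMG.

```python
# Minimal sketch: hyperparameters reported in the paper, collected into plain
# dictionaries. The entry point below is a hypothetical placeholder; consult
# the released NMG repository (https://github.com/hansam95/NMG) for the real API.

EDITING_CONFIG = {
    "num_diffusion_steps": 50,   # T = 50, standardized across all experiments
    "noise_map_guidance": 10.0,  # s_N
    "text_guidance": 10.0,       # s_T
    "guidance_scale": 5000.0,    # s_g
}

RECONSTRUCTION_CONFIG = {
    "num_diffusion_steps": 50,   # T = 50
    "noise_map_guidance": 10.0,  # s_N
    "text_guidance": 7.5,        # s_T
    "guidance_scale": 10000.0,   # s_g
}


def run_nmg_edit(image_path: str, source_prompt: str, target_prompt: str, config: dict) -> None:
    """Hypothetical placeholder for the NMG pipeline (DDIM inversion with noise
    maps, then guided sampling combined with an editing method such as
    Prompt-to-Prompt, MasaCtrl, or pix2pix-zero). The real implementation lives
    in the authors' repository and may expose a different interface."""
    raise NotImplementedError("Replace with the entry point from the NMG repository.")


if __name__ == "__main__":
    print("Editing config:", EDITING_CONFIG)
    print("Reconstruction config:", RECONSTRUCTION_CONFIG)
```

The two dictionaries only restate the values quoted in the Experiment Setup row; whether they map one-to-one onto command-line flags or function arguments in the released code is not specified in the paper excerpt above.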