Noise Map Guidance: Inversion with Spatial Context for Real Image Editing
Authors: Hansam Cho, Jonghyun Lee, Seoung Bum Kim, Tae-Hyun Oh, Yonghyun Jeong
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical investigations highlight NMG's adaptability across various editing techniques and its robustness to variants of DDIM inversions. Compared to other inversion methods, NMG (a) demonstrates high fidelity editing when paired with Prompt-to-Prompt, (b) successfully conducts viewpoint alteration via MasaCtrl, and (c) preserves the spatial context of the input image while performing zero-shot image-to-image translation with pix2pix-zero. |
| Researcher Affiliation | Collaboration | Hansam Cho (1,2), Jonghyun Lee (1,2), Seoung Bum Kim (1), Tae-Hyun Oh (3,4), Yonghyun Jeong (2). (1) School of Industrial and Management Engineering, Korea University; (2) NAVER Cloud; (3) Dept. of Electrical Engineering and Grad. School of Artificial Intelligence, POSTECH; (4) Institute for Convergence Research and Education in Advanced Technology, Yonsei University. {chosam95, tomtom1103, sbkim1}@korea.ac.kr, {taehyun}@postech.ac.kr, {yonghyun.jeong}@navercorp.com |
| Pseudocode | No | The paper describes the Noise Map Guidance (NMG) process using mathematical equations and descriptive text, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code can be found at https://github.com/hansam95/NMG. |
| Open Datasets | Yes | Our evaluation utilizes the MS-COCO validation set Lin et al. (2014), from which we randomly select 100 images. |
| Dataset Splits | Yes | Our evaluation utilizes the MS-COCO validation set Lin et al. (2014), from which we randomly select 100 images. We edit 20 images for each task and report the averaging scores in Table 1. |
| Hardware Specification | No | Training and experiments were done on the Naver Smart Machine Learning (NSML) platform (Kim et al., 2018). This mentions a platform but does not provide specific hardware details such as GPU models, CPU types, or memory amounts. |
| Software Dependencies | No | Within our experimental framework, we employ Stable Diffusion (Rombach et al., 2022), standardizing the diffusion steps to T = 50 across all experiments. While Stable Diffusion is mentioned, specific version numbers for this or any other software libraries (e.g., PyTorch, TensorFlow, Python version) are not provided. |
| Experiment Setup | Yes | For the editing tasks, the parameters are set as follows: noise map guidance s_N = 10, text guidance s_T = 10, and guidance scale s_g = 5000. For the reconstruction tasks, the configurations are set to s_N = 10, s_T = 7.5, and s_g = 10000. |
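
The settings reported in the Experiment Setup and Software Dependencies rows can be collected into a small configuration sketch. This is only an illustration: the class name `NMGSettings` and its field names are hypothetical and are not taken from the NMG repository; only the numeric values come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NMGSettings:
    """Hypothetical container for the reported hyperparameters.

    Field names are illustrative assumptions, not the repository's API.
    """
    noise_map_guidance: float  # s_N in the paper's notation
    text_guidance: float       # s_T
    guidance_scale: float      # s_g
    diffusion_steps: int = 50  # T, reported as fixed across all experiments


# Values reported for the editing tasks.
EDITING = NMGSettings(
    noise_map_guidance=10.0, text_guidance=10.0, guidance_scale=5000.0
)

# Values reported for the reconstruction tasks.
RECONSTRUCTION = NMGSettings(
    noise_map_guidance=10.0, text_guidance=7.5, guidance_scale=10000.0
)
```

The two presets differ only in text guidance and guidance scale; the noise map guidance and the number of diffusion steps are the same in both.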