Eliminating the Cross-Domain Misalignment in Text-guided Image Inpainting
Authors: Muqi Huang, Chaoyue Wang, Yong Luo, Lefei Zhang
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show exceptional performance on leading datasets such as MS-COCO and Open Images, surpassing state-of-the-art text-guided image inpainting methods. |
| Researcher Affiliation | Collaboration | Muqi Huang¹, Chaoyue Wang², Yong Luo¹ and Lefei Zhang¹,³. ¹Institute of Artificial Intelligence, School of Computer Science, Wuhan University; ²JD Explore Academy; ³Hubei Luojia Laboratory |
| Pseudocode | No | The paper describes the method and architecture but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is released at: https://github.com/MucciH/ECDM-inpainting. |
| Open Datasets | Yes | We fine-tune our model on the standard MS-COCO dataset [Lin et al., 2014], which comprises over 100k images in the training set. For testing, we utilize 5k image-text pairs from the MS-COCO validation set. To assess the robustness of our model to diverse data, we further validate its performance on 1.5k images from the Open Images dataset [Kuznetsova et al., 2020]. |
| Dataset Splits | Yes | For testing, we utilize 5k image-text pairs from the MS-COCO validation set. |
| Hardware Specification | Yes | Each experiment necessitates the utilization of one A100 GPU. |
| Software Dependencies | Yes | We employ our proposed Structure-Aware Inpainting Learning (SAIL) approach for image inpainting under the architecture of ControlNet, and it is fine-tuned from ControlNet v1.1 Inpaint version. |
| Experiment Setup | Yes | The learning rate is set at 5e-5, and the batch size is configured to be 4. |
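
For anyone attempting a reproduction, the settings reported in the table can be gathered into a single configuration object. The following is a minimal sketch, not the authors' code: the class name `ReportedSetup`, its field names, and the helper `unreported_fields` are all hypothetical. Only the values quoted in the table are filled in; the optimizer, step count, and seed are not stated there, so they are left unset rather than guessed.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass(frozen=True)
class ReportedSetup:
    """Hypothetical container for the settings quoted in the table."""

    # Values stated in the table
    learning_rate: float = 5e-5
    batch_size: int = 4
    num_gpus: int = 1
    gpu_type: str = "A100"
    base_model: str = "ControlNet v1.1 Inpaint"
    train_dataset: str = "MS-COCO"

    # Not reported in the table: left as None instead of guessing
    optimizer: Optional[str] = None
    training_steps: Optional[int] = None
    random_seed: Optional[int] = None


def unreported_fields(setup: ReportedSetup) -> List[str]:
    """Return the names of fields that still need a value before training."""
    return [name for name, value in asdict(setup).items() if value is None]


if __name__ == "__main__":
    setup = ReportedSetup()
    print("Reported:", {k: v for k, v in asdict(setup).items() if v is not None})
    print("Still to be determined:", unreported_fields(setup))
```

Keeping the missing values explicit makes it clear which parts of a rerun come from the paper and which would have to be taken from the released repository or chosen independently.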