Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Guiding Instruction-based Image Editing via Multimodal Large Language Models
Authors: Tsu-Jui Fu, Wenze Hu, Xianzhi Du, William Yang Wang, Yinfei Yang, Zhe Gan
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate various aspects of Photoshop-style modification, global photo optimization, and local editing. Extensive experimental results demonstrate that expressive instructions are crucial to instruction-based image editing, and our MGIE can lead to a notable improvement in automatic metrics and human evaluation while maintaining competitive inference efficiency. |
| Researcher Affiliation | Collaboration | Tsu-Jui Fu¹, Wenze Hu², Xianzhi Du², William Yang Wang¹, Yinfei Yang², Zhe Gan²; ¹UC Santa Barbara, ²Apple |
| Pseudocode | Yes | Algorithm 1 MLLM-Guided Image Editing |
| Open Source Code | No | The paper mentions a "Project website: https://mllm-ie.github.io" but does not explicitly state that the source code for the methodology is available there, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | We use IPr2Pr (Brooks et al., 2023) as our pre-training data. [...] For a comprehensive evaluation, we consider various editing aspects. EVR (Tan et al., 2019), GIER (Shi et al., 2020), MA5k (Shi et al., 2022), and MagicBrush (Zhang et al., 2023a). |
| Dataset Splits | Yes | We treat the same training/validation/testing split as the original settings. |
| Hardware Specification | Yes | All experiments are conducted in PyTorch (Paszke et al., 2017) on 8 A100 GPUs. |
| Software Dependencies | No | The paper states that experiments are conducted in "PyTorch (Paszke et al., 2017)" but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The learning rates of the MLLM and F are 5e-4 and 1e-4, respectively. All experiments are conducted in PyTorch (Paszke et al., 2017) on 8 A100 GPUs. We adopt AdamW (Loshchilov & Hutter, 2019) with the batch size of 128 to optimize MGIE. [...] During inference, we use V = 1.5 and X = 7.5. |
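
For readers attempting a reproduction, the values quoted in the Experiment Setup row can be gathered into one configuration object. The following is a minimal sketch, not the authors' code: the class name `ReportedSetup`, its field names, and both helper methods are hypothetical, and the even per-GPU split of the batch is an assumption that the paper excerpt does not state. Only the numeric values and the optimizer name come from the quoted text.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass(frozen=True)
class ReportedSetup:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""

    lr_mllm: float = 5e-4       # reported learning rate of the MLLM
    lr_f: float = 1e-4          # reported learning rate of F
    batch_size: int = 128       # reported batch size
    num_gpus: int = 8           # reported number of A100 GPUs
    optimizer: str = "AdamW"    # reported optimizer
    inference_v: float = 1.5    # reported inference value "V"
    inference_x: float = 7.5    # reported inference value "X"

    def per_gpu_batch_size(self) -> int:
        # Assumption: the batch is divided evenly across GPUs.
        if self.batch_size % self.num_gpus != 0:
            raise ValueError("batch size is not divisible by the GPU count")
        return self.batch_size // self.num_gpus

    def param_groups(
        self, mllm_params: Iterable[Any], f_params: Iterable[Any]
    ) -> List[Dict[str, Any]]:
        # Per-group learning rates, laid out as the list of dicts that
        # PyTorch optimizers accept in place of a flat parameter list.
        return [
            {"params": list(mllm_params), "lr": self.lr_mllm},
            {"params": list(f_params), "lr": self.lr_f},
        ]


setup = ReportedSetup()
groups = setup.param_groups(["mllm_weight"], ["f_weight"])  # placeholder parameters
```

With PyTorch installed, real parameter iterables could be passed to `param_groups` and the result handed to `torch.optim.AdamW`. Settings the table does not quote, such as weight decay or the learning-rate schedule, are left out here and would have to be taken from the paper.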