Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
EditInfinity: Image Editing with Binary-Quantized Generative Models
Authors: Jiahuan Wang, Yuxin Chen, Jun Yu, Guangming Lu, Wenjie Pei
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the PIE-Bench benchmark across add, change, and delete editing operations, demonstrate the superior performance of our model compared to state-of-the-art diffusion-based baselines. |
| Researcher Affiliation | Academia | 1 Harbin Institute of Technology, Shenzhen; 2 The Hong Kong University of Science and Technology |
| Pseudocode | Yes | Algorithm 1 Multi-scale Autoregressive Editing |
| Open Source Code | Yes | Code available at: https://github.com/yx-chen-ust/EditInfinity. |
| Open Datasets | Yes | We conduct comprehensive experiments on PIE-Bench (Prompt-based Image Editing Benchmark) [20], the prevailing standard in image editing evaluation. This benchmark contains 700 test cases covering nine editing types. Each case provides a source image with a corresponding prompt, target editing prompt, and the editing mask. |
| Dataset Splits | Yes | We conduct comprehensive experiments on PIE-Bench (Prompt-based Image Editing Benchmark) [20], the prevailing standard in image editing evaluation. This benchmark contains 700 test cases covering nine editing types. Each case provides a source image with a corresponding prompt, target editing prompt, and the editing mask. |
| Hardware Specification | Yes | Inversion is trained on two NVIDIA L20 GPUs, and editing runs on a single NVIDIA L20 GPU. |
| Software Dependencies | No | We implement our method based on Infinity-2B 0. For editing, we set τ1 = 1 and τ2 = 4 in Equation 6. Inversion is trained on two NVIDIA L20 GPUs, and editing runs on a single NVIDIA L20 GPU. Refer to Supplementary Material A.2 for more details. --- A.2 Supplementary Implementation Details. During image inversion, we set the learning rate to 4.6875e-5 and use the AdamW optimizer (β1 = 0.9, β2 = 0.97) for both the learnable prompt and LoRA training. |
| Experiment Setup | Yes | We implement our method based on Infinity-2B 0. For editing, we set τ1 = 1 and τ2 = 4 in Equation 6. [...] During image inversion, we set the learning rate to 4.6875e-5 and use the AdamW optimizer (β1 = 0.9, β2 = 0.97) for both the learnable prompt and LoRA training. The two components are optimized sequentially, starting with the learnable prompt, followed by LoRA. To accelerate the convergence of training LoRA, a KL-divergence loss is introduced in addition to the standard cross-entropy loss. Typically, the learnable prompt is trained for 10 iterations, while LoRA is trained for 20 iterations. |
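
The setup quoted in the Experiment Setup row can be captured as a small configuration sketch. This is not the authors' code: the names `InversionConfig` and `inversion_schedule` are hypothetical, and the assumption that the learnable-prompt phase uses cross-entropy alone is inferred from the quote, which mentions the KL term only for LoRA. Only the numeric values come from the quoted text.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class InversionConfig:
    """Hyperparameters quoted in the table (class and field names are hypothetical)."""
    learning_rate: float = 4.6875e-5               # shared by prompt and LoRA training
    adam_betas: Tuple[float, float] = (0.9, 0.97)  # AdamW (beta1, beta2)
    prompt_iters: int = 10                         # learnable prompt is trained first
    lora_iters: int = 20                           # LoRA is trained second
    tau1: int = 1                                  # editing hyperparameters of Equation 6
    tau2: int = 4


def inversion_schedule(cfg: InversionConfig) -> Iterator[Tuple[str, int, Tuple[str, ...]]]:
    """Yield (phase, step, loss_terms) for the sequential two-phase optimization."""
    # Phase 1: learnable prompt. Assumption: cross-entropy only.
    for step in range(cfg.prompt_iters):
        yield ("learnable_prompt", step, ("cross_entropy",))
    # Phase 2: LoRA, with the additional KL-divergence term described in the quote.
    for step in range(cfg.lora_iters):
        yield ("lora", step, ("cross_entropy", "kl_divergence"))


if __name__ == "__main__":
    cfg = InversionConfig()
    for phase, step, losses in inversion_schedule(cfg):
        print(f"{phase:>16} step {step:2d} losses={'+'.join(losses)}")
```

A real training loop would replace the print statement with an optimizer step; the sketch only makes the ordering and per-phase loss terms explicit so that a reproduction attempt can check them against the released repository.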