Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
Authors: Litu Rout, Yujia Chen, Nataniel Ruiz, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate RF inversion on stroke-to-image generation and image editing tasks, with additional qualitative results on cartoonization, object insertion, image generation, and content-style composition. Our method significantly improves photorealism in stroke-to-image generation, surpassing a state-of-the-art (SoTA) method (Mokady et al., 2023) by 89%, while maintaining faithfulness to the input stroke. In addition, we show that RF inversion outperforms DM inversion (Meng et al., 2022) in faithfulness by 4.7% and in realism by 13.8% on LSUN-bedroom dataset (Wang et al., 2017). Figure 1 shows a graphical illustration of our method RF-Inversion. |
| Researcher Affiliation | Collaboration | 1 Google 2 UT Austin |
| Pseudocode | Yes | Algorithm 1: Controlled Forward ODE (8). Input: Discretization steps N, reference image y0, prompt embedding network Φ, Flux model u(·, ·, ·; ϕ), Flux noise scheduler σ : [0, 1] → ℝ. Tunable parameter: Controller guidance γ. Output: Structured noise Y1 |
| Open Source Code | Yes | See our project page https://rf-inversion.github.io/ for code and demo. ... Refer to our project page: https://rf-inversion.github.io/ for source code and demo. |
| Open Datasets | Yes | We evaluate RF inversion on stroke-to-image generation and image editing tasks, with additional qualitative results on cartoonization, object insertion, image generation, and content-style composition. ... We show that RF inversion outperforms DM inversion across three benchmarks: LSUN-church, LSUN-bedroom (Wang et al., 2017), and SFHQ (Beniaguev, 2022) on two tasks: Stroke2Image generation and image editing. |
| Dataset Splits | Yes | On the test split of LSUN bedroom dataset, our approach is 4.7% more faithful and 13.79% more realistic than the best optimization free method SDEdit-SD1.5. ... We conduct a user study on the test splits of both LSUN Bedroom and LSUN Church dataset using Amazon Mechanical Turk |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) were explicitly mentioned for running the experiments. |
| Software Dependencies | No | The paper mentions software like "NTI codebase", "Diffusers library", and "Flux" without specifying exact version numbers for these or other underlying libraries/frameworks (e.g., PyTorch, Python). |
| Experiment Setup | Yes | In Table 4, we provide the hyper-parameters for the empirical results reported in §5. We use a fixed γ = 0.5 in our controlled forward ODE (8) and a time-varying guidance parameter ηt in our controlled reverse ODE (15), as motivated in Remark 3.3 and Remark 3.6. Thus, our algorithm introduces one additional hyper-parameter ηt into the Flux pipeline. For each experiment, we use a fixed time-varying schedule of ηt described by starting time (s), stopping time τ, and strength (η). We use the default config for Flux model: 3.5 for classifier-free guidance and 28 for the total number of inference steps. |
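
The Experiment Setup row describes ηt through three numbers: a starting time s, a stopping time τ, and a strength η. The sketch below shows one way such a schedule could be represented next to the reported Flux defaults. It rests on several assumptions that the excerpt does not confirm: the schedule is taken to be piecewise constant (η inside [s, τ], zero elsewhere), the time grid is taken to be uniform over the 28 inference steps, and the values passed to `GuidanceSchedule` are placeholders, not the values from Table 4. The names `GuidanceSchedule` and `eta_per_step` are hypothetical and are not part of the authors' code or of any library.

```python
from dataclasses import dataclass

# Values quoted in the Experiment Setup row.
GAMMA = 0.5               # controller guidance γ for the forward ODE
CFG_SCALE = 3.5           # Flux classifier-free guidance
NUM_INFERENCE_STEPS = 28  # Flux total inference steps


@dataclass(frozen=True)
class GuidanceSchedule:
    """Hypothetical container for the three numbers that describe ηt."""
    start: float     # starting time s
    stop: float      # stopping time τ
    strength: float  # strength η

    def eta(self, t: float) -> float:
        # Assumption: ηt equals η inside [s, τ] and 0 outside.
        # The actual shape used in the paper may differ.
        return self.strength if self.start <= t <= self.stop else 0.0


def eta_per_step(schedule: GuidanceSchedule,
                 num_steps: int = NUM_INFERENCE_STEPS) -> list[float]:
    # Assumption: uniform time grid t_i = i / num_steps for i = 0..num_steps-1.
    return [schedule.eta(i / num_steps) for i in range(num_steps)]


if __name__ == "__main__":
    # Placeholder values for illustration only (not taken from Table 4).
    schedule = GuidanceSchedule(start=0.0, stop=0.25, strength=0.9)
    for step, eta in enumerate(eta_per_step(schedule)):
        print(f"step {step:2d}: eta = {eta}")
```

Under these assumptions, guidance is active only on the steps whose time falls inside [s, τ], so moving τ later extends guidance across more of the 28 steps.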