Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Regularization by Texts for Latent Diffusion Inverse Solvers
Authors: Jeongsol Kim, Geon Yeong Park, Hyungjin Chung, Jong Chul Ye
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our comprehensive experimental results demonstrate that TReg effectively mitigates ambiguity in inverse problems, improving both accuracy and efficiency. ... Our evaluation focuses on two key aspects: (1) the effectiveness of TReg in resolving ambiguity through text, and (2) the accuracy of the resulting solution, which includes alignment with both the text and the measurements. For further details on the experimental settings, please refer to the Appendix. ... As shown in Figure 3(b), TReg leads to consistent solutions corresponding to the given text description, while reconstructed images exhibit multiple solutions... This discrepancy is clearly observed in pixel-level variance in Figure 3(c). ... We prepare measurement-text sets where the text describes the solution. ... we report PSNR and FID of reconstructions in Tables 1 and 2. ... Table 3 shows that TReg achieves superior performance compared to baselines. |
| Researcher Affiliation | Academia | Jeongsol Kim, Geon Yeong Park, Hyungjin Chung, Jong Chul Ye (KAIST; equal contribution) |
| Pseudocode | Yes | To sum up, the proposed algorithm is described as in Algorithm 1. ... Algorithm 1 Inverse problem solving with TReg ... Algorithm 2 Inverse problem solving with TReg with DPS |
| Open Source Code | Yes | In this section, we provide further details on the implementation of TReg. The code will be made publicly available at https://github.com/TReg-inverse/Treg. |
| Open Datasets | Yes | We decide to use the Food-101 dataset (Bossard et al., 2014) for quantitative evaluation... For the qualitative comparison, we additionally use a validation set of the ImageNet, AFHQ, FFHQ and LHQ datasets. |
| Dataset Splits | Yes | We leverage 250 images from each of the "fried rice" and "ice cream" classes. ... The true image is sampled from the ImageNet validation set. ... Using benchmarks like FFHQ, AFHQ-cat, and AFHQ-dog, where ground-truth class labels (e.g., dog, cat) are available... We constructed a 1k ImageNet validation set encompassing all ImageNet classes, following the approach used in P2L. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running its experiments, such as GPU models, CPU types, or memory. |
| Software Dependencies | No | We leverage the pre-trained Latent Diffusion Model (LDM), including an auto-encoder and the U-Net model, provided by diffusers. ... The Stable-diffusion v1.5 is utilized for every experiment in this work, and the ViT-L/14 backbone and its checkpoint is used for the CLIP image encoder for adaptive negation. ... We use 5 iterations of CG update with λ = 1e-4 for each time step... we use the Adam optimizer with learning rate 1e-3 and β1 = 0.9, β2 = 0.999. While software components are named (diffusers, Stable-diffusion v1.5, CLIP, Adam optimizer), explicit version numbers for libraries like diffusers or CLIP are not provided. |
| Experiment Setup | Yes | For linear inverse problems, we select bicubic super-resolution with scale factor 16, Gaussian deblur with kernel size 61 and sigma 5.0, and box inpainting... For all tasks, we add measurement noise that follows the Gaussian distribution with zero mean and noise scale σ₀² = 0.01. ... we use 5 iterations of CG update with λ = 1e-4 for each time step... we set the Network Function Evaluation (NFE) to 200 and design Γ = {t | t mod 3 = 0, t 850}. ... For the Fourier phase retrieval, we set Γ = {t | t mod 10 = 0}. ... we use the Adam optimizer with learning rate 1e-3 and β1 = 0.9, β2 = 0.999. ... For the main experiments with the Food101 dataset in the main paper, we use the default scale 7.5 for all results. For the Fourier phase retrieval problem, we set the CFG scale ω1 = ω2 = 4.0. |
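As a reading aid, the setup quantities quoted in the table above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the helper names and the 1000-step timestep range are assumptions, and the comparison operator next to 850 in Γ is garbled in the extracted quote (≤ is assumed here purely for the sketch).

```python
import numpy as np

def regularization_steps(n_steps=1000, stride=3, t_max=850):
    """Timesteps at which text regularization is applied, per the quoted
    schedule Gamma = {t | t mod stride = 0, t ? t_max}.  The comparison
    with t_max is garbled in the extracted quote; <= is an assumption."""
    return [t for t in range(n_steps) if t % stride == 0 and t <= t_max]

def noisy_measurement(y_clean, sigma0_sq=0.01, rng=None):
    """Add zero-mean Gaussian measurement noise with variance sigma0^2,
    matching the quoted noise scale of 0.01."""
    rng = np.random.default_rng() if rng is None else rng
    return y_clean + np.sqrt(sigma0_sq) * rng.standard_normal(y_clean.shape)

# Optimizer hyperparameters quoted in the table (PyTorch-style naming).
ADAM_CFG = dict(lr=1e-3, betas=(0.9, 0.999))

gamma = regularization_steps()            # super-resolution/deblur schedule
gamma_phase = regularization_steps(stride=10, t_max=999)  # phase retrieval
```

Note that the schedules Γ are sparse subsets of the timesteps, which is how the method keeps the per-step regularization cost bounded at NFE = 200.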