Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
System-Embedded Diffusion Bridge Models
Authors: Bartlomiej Sobieski, Matthew Tivnan, Yuang Wang, Siyeop Yoon, Pengfei Jin, Dufan Wu, Quanzheng Li, Przemyslaw Biecek
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments, Table 2: Quantitative comparison of SDB with the baselines across four inverse problems., Figure 2: Qualitative comparison of SDB (SB) with the best-performing baselines (bridge methods)., E.6 Ablation study on noise schedules |
| Researcher Affiliation | Academia | ¹University of Warsaw, ²Harvard University, ³Massachusetts General Hospital, ⁴Warsaw University of Technology; corresponding author at EMAIL |
| Pseudocode | Yes | Algorithm 1 SDB Training, Algorithm 2 SDB Sampling (Euler-Maruyama) |
| Open Source Code | Yes | We include the source code at https://github.com/sobieskibj/sdb. |
| Open Datasets | Yes | We evaluate SDB on four inverse problems with varying measurement system complexities, using original images at a resolution of 256 × 256. Building on prior work [Luo et al., 2023a, Yue et al., 2024], we first consider inpainting on CelebA-HQ [Karras et al., 2018]... we examine superresolution on DIV2K [Agustsson and Timofte, 2017, Timofte et al., 2017]... CT reconstruction on the RSNA Intracranial Hemorrhage dataset [Anouk Stein et al., 2019] and MRI reconstruction on the Br35H dataset [Merlin, 2022]... motion deblurring task on 128 × 128 flower images from the Flowers102 dataset [Nilsback and Zisserman, 2008]. |
| Dataset Splits | Yes | Following standard evaluation practice [Luo et al., 2023a, Yue et al., 2024], we report perceptual scores (FID [Heusel et al., 2017], LPIPS [Zhang et al., 2018]) and reconstruction metrics (PSNR, SSIM)., each supervised method learns a mapping between signal samples and their PRs... we train score networks from scratch using the training hyperparameters and architecture of Luo et al. [2023a], with 256 training epochs for supervised methods and 512 for unsupervised ones |
| Hardware Specification | Yes | All experiments were conducted on a cluster of NVIDIA A100 GPUs, with each method trained using a single GPU. |
| Software Dependencies | No | We follow the training procedure proposed by Luo et al. [2023a], using the ADAM optimizer [Kingma and Ba, 2015]... or gradient-based operations available in autodifferentiation frameworks such as PyTorch |
| Experiment Setup | Yes | We follow the training procedure proposed by Luo et al. [2023a], using the ADAM optimizer [Kingma and Ba, 2015] with an initial learning rate of 1 × 10⁻⁴, no weight decay, and (β1, β2) = (0.9, 0.99). A multi-step learning rate scheduler is applied, halving the learning rate at the 36th, 60th, 72nd, and 90th epochs, as in the original work. All methods are trained using the ℓ1 loss function with a batch size of 8. |
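
The multi-step schedule quoted in the Experiment Setup row can be illustrated with a small sketch. This is not the authors' code: the function name `learning_rate`, the 1-indexed epoch convention, and the choice that the halved rate already applies during the milestone epoch itself are assumptions made for illustration. Only the initial rate (1 × 10⁻⁴), the halving factor, and the milestone epochs come from the quoted text.

```python
from bisect import bisect_right

# Values quoted in the Experiment Setup row.
BASE_LR = 1e-4                  # initial learning rate
MILESTONES = (36, 60, 72, 90)   # epochs at which the rate is halved
GAMMA = 0.5                     # multiplicative factor applied at each milestone


def learning_rate(epoch: int) -> float:
    """Return the learning rate in effect during `epoch`.

    Assumption (not stated in the source): epochs are 1-indexed and the
    halved rate is already used during the milestone epoch itself.
    """
    # Count how many milestones are less than or equal to the current epoch.
    milestones_passed = bisect_right(MILESTONES, epoch)
    return BASE_LR * GAMMA ** milestones_passed


if __name__ == "__main__":
    # Print the rate just before and at each milestone.
    for epoch in (1, 35, 36, 59, 60, 71, 72, 89, 90):
        print(f"epoch {epoch:>3}: lr = {learning_rate(epoch):.3e}")
```

In PyTorch, a schedule of this shape is typically expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[36, 60, 72, 90], gamma=0.5)`; whether the released repository does so, and how it indexes epochs, is not confirmed by the text above.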