Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Posterior Sampling by Combining Diffusion Models with Annealed Langevin Dynamics
Authors: Zhiyang Xun, Shivam Gupta, Eric Price
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate our theoretical analysis and assess real-world performance, we study three inverse problems on FFHQ 256 [KLA21]: inpainting, 4× super-resolution, and Gaussian deblurring. Experiments use 1k validation images and the pre-trained diffusion model from [CKM+23]. Forward operators are specified as in [CKM+23]: inpainting masks 30% to 70% of pixels uniformly at random; super-resolution downsamples by a factor of 4; deblurring convolves the ground-truth with a Gaussian kernel of size 61 × 61 (std. 3.0). We first obtain initial reconstructions $x_0$ via Diffusion Posterior Sampling (DPS) [DS24], then refine them with our annealed Langevin sampler to draw samples close to $p(x \mid x_0, y)$. To control runtime, we sweep the step size while keeping the annealing schedule fixed. For each step size, we report the per-image L2 distance to the ground truth and the FID of the resulting sample distribution (Figure 4). Across all three tasks, increasing the time devoted to annealed Langevin decreases L2 but increases FID; in the inpainting setting, when the step size is sufficiently small, our method surpasses DPS on both metrics. Qualitatively, our reconstructions better preserve ground-truth attributes compared to DPS (Figures 5 and 6). |
| Researcher Affiliation | Collaboration | Zhiyang Xun (UT Austin); Shivam Gupta (UT Austin); Eric Price (UT Austin & Microsoft Research) |
| Pseudocode | Yes | Algorithm 1: Sampling from $p(x \mid Ax + \mathcal{N}(0, \eta^2 I_m) = y)$ ... Algorithm 2: Sampling from $p(x \mid x_0, y)$ given an extra Gaussian measurement $x_0$ ... Algorithm 3: Competitive Compressed Sensing Algorithm Given a Rough Estimation |
| Open Source Code | Yes | Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We provide open access to the code. The datasets we use are open access. |
| Open Datasets | Yes | To validate our theoretical analysis and assess real-world performance, we study three inverse problems on FFHQ 256 [KLA21]: inpainting, 4× super-resolution, and Gaussian deblurring. ... The datasets we use are open access. |
| Dataset Splits | Yes | Experiments use 1k validation images and the pre-trained diffusion model from [CKM+23]. |
| Hardware Specification | Yes | All experiments were run on a cluster with four NVIDIA A100 GPUs and required roughly two hours per task. |
| Software Dependencies | No | The paper does not explicitly list specific software versions (e.g., Python, PyTorch version numbers) used for implementation. It mentions a 'pre-trained diffusion model from [CKM+23]' but this refers to a model/paper, not a specific software dependency with a version number. |
| Experiment Setup | Yes | Forward operators are specified as in [CKM+23]: inpainting masks 30% to 70% of pixels uniformly at random; super-resolution downsamples by a factor of 4; deblurring convolves the ground-truth with a Gaussian kernel of size 61 × 61 (std. 3.0). ... To control runtime, we sweep the step size while keeping the annealing schedule fixed. |
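
The three forward operators named in the Experiment Setup row can be illustrated with a minimal NumPy sketch. This is not the authors' code or the [CKM+23] implementation. It rests on several assumptions: single-channel images, a masked fraction drawn uniformly from 30% to 70% (one reading of the quoted range), average pooling for the 4× downsampling, and zero padding at the borders of the blur. All function names are hypothetical.

```python
import numpy as np


def random_inpainting_mask(shape, rng, low=0.3, high=0.7):
    """Return a 0/1 mask (1 = observed pixel).

    Assumption: the hidden fraction is drawn uniformly from [low, high],
    then each pixel is hidden independently with that probability.
    """
    hidden_fraction = rng.uniform(low, high)
    return (rng.random(shape) >= hidden_fraction).astype(np.float64)


def downsample(image, factor=4):
    """Downsample by average pooling (assumed; the source only gives the factor)."""
    height, width = image.shape
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by the factor")
    blocks = image.reshape(height // factor, factor, width // factor, factor)
    return blocks.mean(axis=(1, 3))


def gaussian_kernel_1d(size=61, std=3.0):
    """Normalized 1D Gaussian; the 61 x 61 kernel is its outer product with itself."""
    offsets = np.arange(size) - (size - 1) / 2
    kernel = np.exp(-0.5 * (offsets / std) ** 2)
    return kernel / kernel.sum()


def gaussian_blur(image, size=61, std=3.0):
    """Separable Gaussian blur with zero padding.

    Requires each image side to be at least `size` pixels, so that
    np.convolve(..., mode="same") keeps the image shape.
    """
    kernel = gaussian_kernel_1d(size, std)
    blur_1d = lambda line: np.convolve(line, kernel, mode="same")
    blurred_rows = np.apply_along_axis(blur_1d, 1, image)
    return np.apply_along_axis(blur_1d, 0, blurred_rows)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random((256, 256))  # stand-in for one 256 x 256 image channel
    y_inpaint = random_inpainting_mask(x.shape, rng) * x
    y_superres = downsample(x, factor=4)
    y_deblur = gaussian_blur(x)
    print(y_inpaint.shape, y_superres.shape, y_deblur.shape)
```

Each function maps a ground-truth image to a noiseless measurement; measurement noise, color channels, and the exact resampling filter would follow whatever [CKM+23] specifies.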