Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Practical and Asymptotically Exact Conditional Sampling in Diffusion Models
Authors: Luhuan Wu, Brian Trippe, Christian Naesseth, David Blei, John P. Cunningham
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first find in simulation and in conditional image generation tasks that TDS provides a computational statistical trade-off, yielding more accurate approximations with many particles but with empirical improvements over heuristics with as few as two particles. We then turn to motif-scaffolding, a core task in protein design, using a TDS extension to Riemannian diffusion models; on benchmark tasks, TDS allows flexible conditioning criteria and often outperforms the state-of-the-art, conditionally trained model. |
| Researcher Affiliation | Academia | Luhuan Wu (Columbia University); Brian L. Trippe* (Columbia University); Christian A. Naesseth (University of Amsterdam); David M. Blei (Columbia University); John P. Cunningham (Columbia University) |
| Pseudocode | Yes | Algorithm 1: Twisted Diffusion Sampler (TDS) |
| Open Source Code | Yes | Code: https://github.com/blt2114/twisted_diffusion_sampler |
| Open Datasets | Yes | On the MNIST dataset, we compare TDS to TDS-IS, Gradient Guidance, and IS. We next apply TDS to higher dimension datasets. Figure 2c shows samples from TDS (K = 16) using a pre-trained diffusion model and a pretrained classifier on the ImageNet dataset (256 × 256 × 3 dimensions). |
| Dataset Splits | No | The paper mentions '10,000 validation images' for inpainting tasks in Appendix D.2.2, but does not explicitly provide the training/validation/test dataset splits (e.g., percentages or counts) for model training in a reproducible manner. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types) used to run its experiments. |
| Software Dependencies | No | The paper mentions 'guided diffusion codebase' and 'ResNet50 model' but does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | The model architecture is based on the guided diffusion codebase with the following specifications: number of channels = 64, attention resolutions = '28,14,7', number of residual blocks = 3, learn sigma (i.e. to learn the variance of $p_\theta(x_{t-1} \mid x_t)$) = True, resblock updown = True, dropout = 0.1, variance schedule = 'linear'. We trained the model for 60k epochs with a batch size of 128 and a learning rate of $10^{-4}$ on 60k MNIST training images. The model uses T = 1,000 for training and T = 100 for sampling. |
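
The values reported in the Experiment Setup row can be collected into a single configuration object, which may make them easier to check against a training script. The following is a minimal sketch only: the class name `MnistDiffusionSetup`, its field names, and both helper methods are hypothetical and are not taken from the paper or its repository. The stride calculation additionally assumes that the 100 sampling steps are an evenly spaced subset of the 1,000 training steps, which the quoted text does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MnistDiffusionSetup:
    """Hypothetical container for the hyperparameters quoted in the table."""

    # Architecture settings, as reported
    num_channels: int = 64
    attention_resolutions: str = "28,14,7"
    num_res_blocks: int = 3
    learn_sigma: bool = True
    resblock_updown: bool = True
    dropout: float = 0.1
    variance_schedule: str = "linear"

    # Training settings, as reported
    num_epochs: int = 60_000
    batch_size: int = 128
    learning_rate: float = 1e-4
    num_train_images: int = 60_000

    # Diffusion steps, as reported
    train_timesteps: int = 1_000
    sampling_timesteps: int = 100

    def attention_resolution_list(self) -> List[int]:
        """Parse the comma-separated resolution string into integers."""
        return [int(r) for r in self.attention_resolutions.split(",")]

    def sampling_stride(self) -> int:
        """Stride between sampled steps, ASSUMING evenly spaced respacing."""
        if self.train_timesteps % self.sampling_timesteps != 0:
            raise ValueError("sampling steps must divide training steps")
        return self.train_timesteps // self.sampling_timesteps


setup = MnistDiffusionSetup()
print(setup.attention_resolution_list())
print(setup.sampling_stride())
```

Freezing the dataclass keeps the recorded values immutable, so a reproduction attempt that deviates from them has to do so explicitly by constructing a new instance.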