Practical and Asymptotically Exact Conditional Sampling in Diffusion Models
Authors: Luhuan Wu, Brian Trippe, Christian Naesseth, David Blei, John P. Cunningham
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first find in simulation and in conditional image generation tasks that TDS provides a computational-statistical trade-off, yielding more accurate approximations with many particles but with empirical improvements over heuristics with as few as two particles. We then turn to motif-scaffolding, a core task in protein design, using a TDS extension to Riemannian diffusion models; on benchmark tasks, TDS allows flexible conditioning criteria and often outperforms the state-of-the-art, conditionally trained model. |
| Researcher Affiliation | Academia | Luhuan Wu (Columbia University, lw2827@columbia.edu); Brian L. Trippe* (Columbia University, blt2114@columbia.edu); Christian A. Naesseth (University of Amsterdam, c.a.naesseth@uva.nl); David M. Blei (Columbia University, david.blei@columbia.edu); John P. Cunningham (Columbia University, jpc2181@columbia.edu) |
| Pseudocode | Yes | Algorithm 1: Twisted Diffusion Sampler (TDS); a hedged sketch of the algorithm follows the table. |
| Open Source Code | Yes | Code: https://github.com/blt2114/twisted_diffusion_sampler |
| Open Datasets | Yes | On the MNIST dataset, we compare TDS to TDS-IS, Gradient Guidance, and IS. We next apply TDS to higher-dimensional datasets. Figure 2c shows samples from TDS (K = 16) using a pre-trained diffusion model and a pre-trained classifier on the ImageNet dataset (256 × 256 × 3 dimensions). |
| Dataset Splits | No | The paper mentions '10,000 validation images' for inpainting tasks in Appendix D.2.2, but does not explicitly provide the training/validation/test dataset splits (e.g., percentages or counts) for model training in a reproducible manner. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types) used to run its experiments. |
| Software Dependencies | No | The paper mentions the 'guided diffusion codebase' and a 'ResNet50 model' but does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | The model architecture is based on the guided diffusion codebase with the following specifications: number of channels = 64, attention resolutions = '28,14,7', number of residual blocks = 3, learn sigma (i.e., learn the variance of p_θ(x_{t-1} \| x_t)) = True, resblock updown = True, dropout = 0.1, variance schedule = 'linear'. We trained the model for 60k epochs with a batch size of 128 and a learning rate of 10^-4 on 60k MNIST training images. The model uses T = 1,000 for training and T = 100 for sampling. These settings are arranged as a configuration sketch after the table. |
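
Since the Pseudocode row only names Algorithm 1, a minimal sketch of the Twisted Diffusion Sampler may help make that row concrete. This is a from-scratch reading of the algorithm as the paper describes it (sequential Monte Carlo over diffusion particles with a classifier-based twisting function), not the authors' released implementation; the names `eps_model` and `log_lik` and their signatures are assumptions.

```python
# A minimal sketch of the Twisted Diffusion Sampler (TDS, Algorithm 1) for
# class-conditional DDPM sampling. Hypothetical interfaces, not the released API:
#   eps_model(x, t) -> predicted noise eps_theta(x_t, t)  (differentiable)
#   log_lik(x0, y)  -> per-particle log p(y | x0)         (differentiable)
import torch

def log_gauss(x, mean, var):
    # Log-density of an isotropic Gaussian N(mean, var * I), summed over non-batch dims.
    return (-0.5 * ((x - mean) ** 2 / var + torch.log(2 * torch.pi * var))).flatten(1).sum(-1)

def tds_sample(eps_model, log_lik, y, betas, K=16, shape=(1, 28, 28)):
    alphas = 1.0 - betas
    alphas_bar = torch.cumprod(alphas, dim=0)
    T = len(betas)

    def twist_and_grad(x, t):
        # Twisting function: p(y | x0_hat), with x0_hat the Tweedie/DDPM denoised estimate.
        x = x.detach().requires_grad_(True)
        eps = eps_model(x, t)
        x0_hat = (x - torch.sqrt(1 - alphas_bar[t]) * eps) / torch.sqrt(alphas_bar[t])
        lt = log_lik(x0_hat, y)
        (grad,) = torch.autograd.grad(lt.sum(), x)
        return lt.detach(), grad, eps.detach()

    x = torch.randn(K, *shape)                        # K particles x_T ~ N(0, I)
    log_twist, grad, eps = twist_and_grad(x, T - 1)
    log_w = log_twist.clone()                         # initial weights ~ twist at time T

    for t in range(T - 1, 0, -1):
        # Multinomial resampling, then reset weights to uniform.
        idx = torch.multinomial(torch.softmax(log_w, dim=0), K, replacement=True)
        x, log_twist, grad, eps = x[idx], log_twist[idx], grad[idx], eps[idx]

        # Twisted proposal: reverse-diffusion mean shifted by the twist gradient.
        var = betas[t]
        mean = (x - betas[t] / torch.sqrt(1 - alphas_bar[t]) * eps) / torch.sqrt(alphas[t])
        prop_mean = mean + var * grad
        x = prop_mean + torch.sqrt(var) * torch.randn_like(x)

        # Incremental weight: untwisted kernel x new twist / (old twist x proposal).
        prev_log_twist = log_twist
        log_twist, grad, eps = twist_and_grad(x, t - 1)
        log_w = (log_gauss(x, mean, var) + log_twist
                 - prev_log_twist - log_gauss(x, prop_mean, var))

    return x, log_w                                   # weighted particles ~ p(x_0 | y)
```

The returned pairs (x, log_w) form a weighted particle approximation of the conditional p(x_0 | y); the paper's asymptotic-exactness claim corresponds to this approximation becoming exact as K grows, while the Research Type row above notes empirical gains over gradient-guidance heuristics with as few as two particles.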
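
The Experiment Setup row can likewise be read as a training configuration. Below, the reported hyperparameters are arranged as keyword flags in the style of the guided-diffusion codebase the paper builds on; the flag names follow that codebase's conventions and are an assumption, not a copy of the authors' training scripts.

```python
# Reported MNIST hyperparameters, arranged as guided-diffusion-style flags
# (flag names assumed from that codebase's conventions, not the authors' scripts).
model_config = dict(
    image_size=28,                # MNIST images are 28 x 28
    num_channels=64,
    attention_resolutions="28,14,7",
    num_res_blocks=3,
    learn_sigma=True,             # learn the variance of p_theta(x_{t-1} | x_t)
    resblock_updown=True,
    dropout=0.1,
    noise_schedule="linear",      # 'variance schedule' in the row above
    diffusion_steps=1000,         # T = 1,000 at training time
    timestep_respacing="100",     # T = 100 at sampling time
)
train_config = dict(
    batch_size=128,
    lr=1e-4,
    epochs=60_000,                # trained on the 60k MNIST training images
)
```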