DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents
Authors: Yilun Xu, Gabriele Corso, Tommi Jaakkola, Arash Vahdat, Karsten Kreis
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate DisCo-Diff on toy data, several image synthesis tasks as well as molecular docking, and find that introducing discrete latents consistently improves model performance. For example, DisCo-Diff achieves state-of-the-art FID scores on class-conditioned ImageNet-64/128 datasets with ODE sampler. |
| Researcher Affiliation | Collaboration | ¹NVIDIA, ²MIT. Correspondence to: Yilun Xu <yilunx@nvidia.com>. |
| Pseudocode | Yes | We provide the algorithm pseudocode for training and sampling in Appendix C. |
| Open Source Code | Yes | Please see the source code in the Supplementary Material for all low-level details. |
| Open Datasets | Yes | We use the ImageNet (Deng et al., 2009) dataset and tackle both class-conditional (at varying resolutions 64×64 and 128×128) and unconditional synthesis. |
| Dataset Splits | Yes | Data for training and evaluation comes from the PDBBind dataset (Liu et al., 2017) with time-based splits (complexes before 2019 for training and validation, selected complexes from 2019 for testing). A split sketch is given after the table. |
| Hardware Specification | Yes | on a single NVIDIA A100 GPU. |
| Software Dependencies | No | No specific version numbers for key software components like Python, PyTorch, CUDA, RDKit, or e3nn were provided. Only software names were mentioned without versions. |
| Experiment Setup | Yes | We set the latent dimension to m = 10 and the codebook size to k = 100 in DisCo-Diff. We use Heun's second-order method as the ODE sampler, and a 12-layer Transformer as the auto-regressive model. A sampler sketch is given after the table. |
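
The "Dataset Splits" row only states the time-based criterion (pre-2019 complexes for training and validation, selected 2019 complexes for testing); the paper does not reproduce the split code here. The following is a minimal sketch of what such a time-based split could look like; the `Complex` dataclass, `release_year` field, `time_based_split` function, and `val_fraction` default are all hypothetical names introduced for illustration, not taken from the authors' code.

```python
# Hypothetical sketch of a time-based PDBBind split:
# pre-2019 complexes -> train/val, selected 2019 complexes -> test.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Complex:
    pdb_id: str
    release_year: int  # assumed metadata field; not specified in the paper


def time_based_split(complexes: List[Complex],
                     cutoff_year: int = 2019,
                     val_fraction: float = 0.1) -> Tuple[List[Complex], List[Complex], List[Complex]]:
    """Split protein-ligand complexes by release year, following the stated criterion."""
    earlier = [c for c in complexes if c.release_year < cutoff_year]
    test = [c for c in complexes if c.release_year == cutoff_year]

    # Carve a validation set out of the pre-cutoff complexes (fraction is an assumption).
    n_val = int(len(earlier) * val_fraction)
    val, train = earlier[:n_val], earlier[n_val:]
    return train, val, test
```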
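The "Experiment Setup" row names Heun's second-order method as the ODE sampler, with the denoiser conditioned on m = 10 discrete latents drawn from a codebook of size k = 100. Below is a minimal, generic sketch of an EDM-style Heun sampler under those assumptions; the `denoiser` signature, the `sigmas` schedule, and the way `latents` are passed in are illustrative guesses, not the authors' released implementation (see their Supplementary Material for the actual code).

```python
# Minimal sketch of Heun's second-order sampler for the probability-flow ODE
# dx/dsigma = (x - D(x, sigma)) / sigma, with the denoiser conditioned on discrete latents.
# Function and argument names are assumptions for illustration only.
import torch


@torch.no_grad()
def heun_sampler(denoiser, x, sigmas, latents):
    """denoiser(x, sigma, latents) -> denoised estimate D(x, sigma).

    `latents` holds the m discrete latent codes (m = 10, codebook size k = 100 in the paper).
    `sigmas` is a decreasing noise schedule ending at 0.
    """
    for i in range(len(sigmas) - 1):
        sigma, sigma_next = sigmas[i], sigmas[i + 1]

        # Euler (predictor) step.
        d = (x - denoiser(x, sigma, latents)) / sigma
        x_next = x + (sigma_next - sigma) * d

        # Second-order (corrector) step, skipped at the final step where sigma_next == 0.
        if sigma_next > 0:
            d_next = (x_next - denoiser(x_next, sigma_next, latents)) / sigma_next
            x_next = x + (sigma_next - sigma) * 0.5 * (d + d_next)

        x = x_next
    return x
```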