Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations
Authors: Kaiwen Xue, Yuhao Zhou, Shen Nie, Xu Min, Xiaolu Zhang, Jun Zhou, Chongxuan Li
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, using the same pre-trained model, our best solver significantly outperforms the original BFN sampler in terms of sample quality with a small number of function evaluations (NFE) (e.g., 10) on both the CIFAR10 and text8 datasets, achieving a 5∼20 times increase in speed for free (see Sec. 6 for details). |
| Researcher Affiliation | Collaboration | 1Gaoling School of AI, Renmin University of China, Beijing, China 2Department of Computer Science and Technology, Tsinghua University, Beijing, China 3Ant Group, Hangzhou, China. |
| Pseudocode | Yes | Algorithm 1 BFN-Solver++1 (on continuous data) ... Algorithm 2 BFN-Solver++2 (on continuous data) ... Algorithm 3 SDE-BFN-Solver++2 (on continuous data) ... Algorithm 4 SDE-BFN-Solver1 (on discrete data) ... Algorithm 5 SDE-BFN-Solver2 (on discrete data) ... Algorithm 6 BFN-Solver1 (on discrete data) ... Algorithm 7 BFN-Solver2 (on discrete data) (an illustrative first-order sampler sketch follows the table) |
| Open Source Code | Yes | Our code is available at https://github.com/ML-GSAI/BFN-Solver. |
| Open Datasets | Yes | For continuous data, the model is trained on the CIFAR-10 (Krizhevsky & Hinton, 2009) dataset, which contains 50K training images. For discrete data, the model is trained on the text8 (Mahoney, 2011) dataset, which contains 90M consecutive characters; each character is a lowercase Latin letter (a–z) or the whitespace token, giving 27 classes (see the vocabulary sketch after this table). |
| Dataset Splits | No | The paper mentions training on CIFAR-10 and text8 datasets but does not explicitly provide details about training/validation/test splits (e.g., percentages, counts, or references to standard splits). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'pre-trained models provided by the BFN (Graves et al., 2023)' but does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | We slightly tune the hyperparameter η for our methods on different NFEs to get the best results, as detailed in Appendix D.3. |
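To make the low-NFE sampling regime referenced in the Research Type and Pseudocode rows concrete, below is a minimal sketch of a generic first-order, data-prediction ODE update in the style of DPM-Solver++. It is an illustration only, not the paper's BFN-Solver algorithms: the schedule functions `alpha` and `sigma`, the placeholder network `data_pred`, and the timestep grid are all assumptions made for this sketch.

```python
import numpy as np

# Hypothetical VP-style noise schedule with alpha_t^2 + sigma_t^2 = 1
# (an assumption for this sketch, not the schedule used in the paper).
def alpha(t):
    return np.cos(0.5 * np.pi * t)   # signal coefficient, t in (0, 1)

def sigma(t):
    return np.sin(0.5 * np.pi * t)   # noise coefficient

def lam(t):
    # Half log-SNR: lambda_t = log(alpha_t / sigma_t)
    return np.log(alpha(t) / sigma(t))

def data_pred(x, t):
    # Placeholder for a pre-trained network that predicts clean data
    # from the noisy state x at time t (not the actual BFN model).
    return x / max(alpha(t), 1e-8)

def first_order_sample(x, timesteps):
    """One pass of a DPM-Solver++(1)-style data-prediction update:
        x_t = (sigma_t / sigma_s) * x_s - alpha_t * (exp(-h) - 1) * D(x_s, s),
    with h = lambda_t - lambda_s. Illustrative only."""
    for s, t in zip(timesteps[:-1], timesteps[1:]):  # s > t: time decreases
        h = lam(t) - lam(s)
        x = (sigma(t) / sigma(s)) * x - alpha(t) * np.expm1(-h) * data_pred(x, s)
    return x

rng = np.random.default_rng(0)
x_T = rng.standard_normal((3, 32, 32))  # start from pure noise
ts = np.linspace(0.999, 1e-3, 11)       # 11 timesteps -> 10 network calls
sample = first_order_sample(x_T, ts)
```

With 11 timesteps the loop costs 10 network evaluations, matching the "a few (e.g., 10) NFE" regime quoted in the Research Type row.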
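The 27-class vocabulary of text8 quoted in the Open Datasets row is easy to reproduce; a minimal sketch follows. The class count (26 lowercase Latin letters plus whitespace) is from the paper, but the symbol ordering, with whitespace first, is an assumption of this sketch.

```python
import string

# text8 vocabulary: whitespace plus the 26 lowercase Latin letters,
# giving the 27 classes quoted in the Open Datasets row.
VOCAB = [' '] + list(string.ascii_lowercase)
CHAR_TO_ID = {c: i for i, c in enumerate(VOCAB)}

def encode(text: str) -> list[int]:
    # Map each character to its class index.
    return [CHAR_TO_ID[c] for c in text]

def decode(ids: list[int]) -> str:
    # Map class indices back to characters.
    return ''.join(VOCAB[i] for i in ids)

assert len(VOCAB) == 27
assert decode(encode("hello world")) == "hello world"
```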