Beta Diffusion

Authors: Mingyuan Zhou, Tianqi Chen, Zhendong Wang, Huangjie Zheng

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on both synthetic data and natural images demonstrate the unique capabilities of beta diffusion in generative modeling of range-bounded data and validate the effectiveness of KLUBs in optimizing diffusion models, thereby making them valuable additions to the family of diffusion-based generative models and the optimization techniques used to train them.
Researcher Affiliation | Academia | Mingyuan Zhou, Tianqi Chen, Zhendong Wang, and Huangjie Zheng, The University of Texas at Austin, Austin, TX 78712
Pseudocode | Yes | We summarize the training and sampling algorithms of beta diffusion in Algorithms 1 and 2, respectively. ... Algorithm 1 Training of Beta Diffusion ... Algorithm 2 Sampling of Beta Diffusion
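
The report quotes Algorithms 1 and 2 but does not reproduce them. As a rough illustration only, the sketch below draws a noisy latent z_t from a beta forward marginal of the form Beta(η·α_t·x0, η·(1 − α_t·x0)) for range-bounded data in (0, 1); the decreasing schedule α_t, the clamping, and the toy shapes are assumptions made for this sketch, and the actual KLUB loss and sampling loop follow the paper's Algorithms 1 and 2.

```python
import torch
from torch.distributions import Beta

# Hypothetical sketch of the beta forward corruption (not the official code).
# Assumes the marginal q(z_t | x0) = Beta(eta * alpha_t * x0, eta * (1 - alpha_t * x0)),
# with x0 in (0, 1); eta controls concentration and alpha_t decreases in t.

def beta_forward_sample(x0: torch.Tensor, alpha_t: torch.Tensor, eta: float = 10000.0) -> torch.Tensor:
    """Draw a noisy latent z_t from the beta forward marginal."""
    c1 = eta * alpha_t * x0               # concentration1
    c0 = eta * (1.0 - alpha_t * x0)       # concentration0
    return Beta(c1, c0).sample()

# Toy usage: corrupt a batch of range-bounded data at random timesteps.
x0 = torch.rand(8, 3, 32, 32).clamp(1e-4, 1 - 1e-4)    # data already in (0, 1)
t = torch.rand(8, 1, 1, 1)                              # timesteps in (0, 1)
alpha_t = torch.exp(-5.0 * t)                           # placeholder decreasing schedule
z_t = beta_forward_sample(x0, alpha_t)
print(z_t.shape, z_t.min().item(), z_t.max().item())    # z_t stays inside (0, 1)
```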
Open Source Code | Yes | Corresponding to: mingyuan.zhou@mccombs.utexas.edu. PyTorch code is available at: https://github.com/mingyuanzhou/Beta-Diffusion
Open Datasets | Yes | Our experiments, conducted on two synthetic datasets and the CIFAR-10 images, primarily aim to showcase beta diffusion's effectiveness in generating range-bounded data. ... For the CIFAR-10 dataset, we utilize the parameterization of EDM [34] as the code base. ... CIFAR-10: https://www.cs.toronto.edu/~kriz/cifar.html
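
For context, here is a minimal, hypothetical sketch of loading CIFAR-10 with pixels mapped to [0, 1], the range-bounded form that beta diffusion models; torchvision and the DataLoader settings are assumptions, since the quote only states that the EDM parameterization is used as the code base.

```python
import torch
from torchvision import datasets, transforms

# Sketch: load CIFAR-10 (https://www.cs.toronto.edu/~kriz/cifar.html) with pixels in [0, 1].
to_unit_interval = transforms.ToTensor()  # maps uint8 [0, 255] to float [0, 1]
train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=to_unit_interval)
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True, num_workers=2)

images, _ = next(iter(loader))
assert images.min() >= 0.0 and images.max() <= 1.0  # range-bounded data, as beta diffusion requires
print(images.shape)  # torch.Size([128, 3, 32, 32])
```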
Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits, though it refers to using CIFAR-10, which has standard splits.
Hardware Specification | Yes | One limitation of beta diffusion is that its training is computationally expensive and data-intensive, akin to Gaussian diffusion. Specifically, with four Nvidia RTX A5000 GPUs, beta diffusion and Gaussian diffusion (VP-EDM) both take approximately 1.46 seconds to process 1000 images of size 32×32×3.
Software Dependencies | Yes | PyTorch code is available at: https://github.com/mingyuanzhou/Beta-Diffusion
Experiment Setup | Yes | We set η = 10000, π = 0.95, and ω = 0.5. As the data already falls within the range of 0 to 1, necessitating neither scaling nor shifting, we set Scale = 1 and Shift = 0. We use the same structured generator fθ for both Gaussian and beta diffusion. We choose 20-dimensional sinusoidal position embeddings [63], with the positions set as 1000t. The network is an MLP structured as (21-256)-ReLU-(256-256)-ReLU-(256-1). We utilize the Adam optimizer with a learning rate of 5e-4 and a mini-batch size of 1000.
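
The synthetic-data setup above is concrete enough to sketch. Below is a hypothetical PyTorch rendering of the described generator fθ: a 20-dimensional sinusoidal embedding of positions 1000t, concatenated with the 1-D noisy input and passed through the (21-256)-ReLU-(256-256)-ReLU-(256-1) MLP, trained with Adam at learning rate 5e-4 and mini-batches of 1000. The exact sin/cos frequency convention is an assumption; the report only cites [63].

```python
import math
import torch
import torch.nn as nn

# Sketch of the generator f_theta for the synthetic-data experiments (assumed details noted above).

def sinusoidal_embedding(t: torch.Tensor, dim: int = 20) -> torch.Tensor:
    """Sin/cos position embedding evaluated at positions 1000*t; returns shape [B, dim]."""
    positions = 1000.0 * t                                   # positions set as 1000t
    half = dim // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half, dtype=torch.float32) / half)
    angles = positions[:, None] * freqs[None, :]
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

class Generator(nn.Module):
    """MLP structured as (21-256)-ReLU-(256-256)-ReLU-(256-1)."""
    def __init__(self, embed_dim: int = 20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1 + embed_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, z_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        emb = sinusoidal_embedding(t)
        return self.net(torch.cat([z_t, emb], dim=-1))

model = Generator()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)    # learning rate from the setup
z_t, t = torch.rand(1000, 1), torch.rand(1000)               # mini-batch size of 1000
out = model(z_t, t)
print(out.shape)  # torch.Size([1000, 1])
```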