Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces
Authors: Javier E. Santos, Zachary R. Fox, Nicholas Lubbers, Yen Ting Lin
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments on the CIFAR-10, Binarized MNIST, and CelebA datasets confirm the feasibility of our approach. |
| Researcher Affiliation | Academia | (1) Computational Earth Science Group (EES-16), Earth and Environmental Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; (2) Information Sciences Group (CCS-3), Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; (3) currently at Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA. |
| Pseudocode | Yes | We summarize the proposed training and inference methods in Algorithms 1 and 2. Algorithm 1 Training Blackout Diffusion. Algorithm 2 Generating images by Blackout Diffusion. Algorithm 3 Training for general Markov processes. Algorithm 4 Generating images by τ-leaping. |
| Open Source Code | Yes | The codes we developed to perform the experiments are deposited at https://github.com/lanl/Blackout-Diffusion, with a C-number C23047 approved by the Richard P. Feynman Center for Innovation (FCI) at the Los Alamos National Laboratory. |
| Open Datasets | Yes | Numerical experiments on the CIFAR-10, Binarized MNIST, and CelebA datasets confirm the feasibility of our approach. |
| Dataset Splits | No | The paper trains on the CIFAR-10, Binarized MNIST, and CelebA datasets and evaluates the results with FID and IS, but it does not specify distinct training/validation/test splits (percentages or sample counts) beyond naming the datasets themselves. |
| Hardware Specification | Yes | The training was carried out by two NVIDIA A100 GPUs for 72 hr. |
| Software Dependencies | No | The paper mentions using the 'improved Noise Conditional Score Network (NCSN++) architecture', but it does not provide specific version numbers for software components or libraries (e.g., Python, PyTorch, CUDA). |
| Experiment Setup | Yes | We fixed T = 1000 in this feasibility experiment. We used mini-batches with 256 samples, and the training was stopped at 300K iterations, above which we observed degraded quality of the generated samples. For image generation, we chose t_T = 15. We performed the analysis on both loss functions, Eqs. (11) and (12), and by both binomial and Poisson sampling. We did not modify other hyperparameters of the NCSN++. |
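The pseudocode and setup rows above refer to binomial sampling under the blackout forward process. As a minimal illustrative sketch (not the authors' released code; see the linked repository for that), the blackout process is a pure-death Markov chain whose forward marginal is binomial thinning: each unit of the integer pixel value survives to time `t` independently, so `x_t | x_0 ~ Binomial(x_0, p(t))` with a survival probability that decays toward zero, here assumed to be `exp(-t)`:

```python
# Hedged sketch of the blackout forward process: binomial thinning of
# integer-valued pixels, assuming survival probability p(t) = exp(-t).
# The function name forward_blackout is ours, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

def forward_blackout(x0, t):
    """Sample x_t | x_0 ~ Binomial(x_0, exp(-t)) elementwise."""
    p_survive = np.exp(-t)
    return rng.binomial(x0, p_survive)

# Toy demo on 8-bit "pixels": counts decay toward the all-zero (blackout) state.
x0 = rng.integers(0, 256, size=(4, 4))
x_mid = forward_blackout(x0, t=1.0)    # partially blacked out
x_late = forward_blackout(x0, t=10.0)  # nearly all zeros
```

Binomial thinning guarantees `0 <= x_t <= x_0` elementwise, and at `t = 0` the sample equals `x_0` exactly; generation then runs the learned process in reverse from the blackout state.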