Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
ItDPDM: Information-Theoretic Discrete Poisson Diffusion Model
Authors: Sagnik Bhattacharya, Abhiram Gorle, Ahsan Bilal, Connor Ding, Amit Kumar Singh Yadav, Tsachy Weissman
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic datasets with varied data distributions show that ItDPDM outperforms earlier baselines in Wasserstein-1 distance and log-likelihood (NLL). Empirically, ItDPDM achieves significantly lower NLL estimates on CIFAR-10 and Lakh MIDI datasets while maintaining competitive generation quality. Section 5: Experiments. |
| Researcher Affiliation | Academia | Sagnik Bhattacharya¹, Abhiram R. Gorle¹, Ahsan Bilal², Connor Ding¹, Amit Kumar Singh Yadav³, Tsachy Weissman¹. ¹Department of Electrical Engineering, Stanford University; ²Department of Computer Science, Oklahoma University; ³School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA |
| Pseudocode | Yes | Algorithm 1: ItDPDM Training. Require: dataset {x_i}, i = 1, …, N; # log-SNR samples S; SNR range [γ_min, γ_max]; denoiser f_θ. 1: for s = 1, …, S do 2: Sample mini-batch B from {x_i} 3: Sample α ∼ Logistic, γ ← exp(α) 4: Sample z_γ ∼ Poisson(γ x_B) 5: x̂_B ← f_θ(data_transform(z_γ), γ) 6: ℓ ← Σ_{i∈B} PRL(x_i, x̂_i), L ← ℓ / q(α) 7: Update θ by gradient descent on L 8: end for 9: return θ. Algorithm 2: ItDPDM Sampling. Require: trained model f_θ; # reverse steps T. 1: Compute {γ_t} (e.g. spaced in log-SNR) 2: Initialize z_{γ_T} ← 0 3: for t = T, T − 1, …, 1 do 4: x̂_0 ← f_θ(data_transform(z_{γ_t}), γ_t) 5: Sample z_{γ_{t−1}} ∼ Poisson(γ_{t−1} x̂_0) 6: end for 7: return x̂_0 |
| Open Source Code | Yes | Our implementation is available here. |
| Open Datasets | Yes | Experiments on synthetic datasets with varied data distributions show that ItDPDM outperforms earlier baselines in Wasserstein-1 distance and log-likelihood (NLL). We evaluate ItDPDM on two discrete datasets: CIFAR-10 images and Lakh MIDI (LMD) symbolic music and compare against existing baselines: Improved DDPM (IDDPM) [38], information-theoretic Gaussian diffusion (ITDiff) [12], discrete masking-based (D3PM) [17], and learning-to-jump (LTJ) [20]. CIFAR-10 comprises 60,000 color images (32 × 32) across 10 classes [39]. LMD contains 648,574 symbolic music sequences of 1024 integers: 0 (rest), 1 (continuation), and 2–89 representing note pitches [40]. |
| Dataset Splits | Yes | We adopt an 80-20 train-test split for evaluating likelihoods. |
| Hardware Specification | No | The paper reports compute details in Appendix, including the use of AWS instances, running time and memory requirements. Training times and number of epochs per experiment are specified. While the paper focuses on final experiments, the total compute for all reported results is modest relative to standard diffusion models. |
| Software Dependencies | No | The training starts with a learning rate of 2 × 10⁻⁵ using the Adam optimizer. We adopt an 80-20 train-test split for evaluating likelihoods. For image generation, we use a UNet-based model [41], while for music generation, we employ the Dense DDPM [42] and convolutional-transformer [17]-based models for the continuous embeddings (DDPM-style) and discrete domain (D3PM [17]) respectively. All audio-based metrics are computed using 10,000 ground-truth samples and 10,000 generated samples per model. To enable consistent audio evaluation, we first convert model-generated .npy files to MIDI format using the pretty_midi library. These MIDI files are then rendered to WAV audio using FluidSynth [58] with the FluidR3_GM soundfont, ensuring uniform timbre across all samples. Fréchet Inception Distance (FID) was evaluated with the PyTorch torch-fidelity package (Inception-v3 network, 2048-dimensional pool3 activations). |
| Experiment Setup | Yes | Models are trained for 200 epochs using the Adam optimizer (η = 10⁻³, β1 = 0.9, β2 = 0.999) with a batch size of 128. The Gaussian DDPM employs a linear noise schedule βt ∈ [10⁻⁴, 2 × 10⁻²] over T = 100 diffusion steps. Our ItDPDM framework adopts a linear gamma schedule γt ∈ [1.0, 0.0] over the same number of steps. For Poisson diffusion, the initial sample mean is set to 10.0. For a fair comparison, we train both CIFAR and LMD models from scratch for 600 epochs. The training starts with a learning rate of 2 × 10⁻⁵ using the Adam optimizer. |
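
Illustrative sketch (not the authors' code): the sampling loop quoted in the Pseudocode row can be written in a few lines of standard-library Python. Everything not in the table is an assumption made for illustration: `toy_denoiser` is a hypothetical stand-in for the trained model f_θ (it returns the posterior mean of x under a Gamma(a, b) prior, which is conjugate to the Poisson channel), `data_transform` is assumed to be `log1p`, the log-spaced γ grid and its endpoints are arbitrary, and `poisson` uses Knuth's method, which is only suitable for small rates.

```python
import math
import random


def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's method (adequate for small lam only)."""
    if lam <= 0.0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def data_transform(z):
    # Assumption: a simple count-compressing transform; the paper's may differ.
    return math.log1p(z)


def toy_denoiser(z_transformed, gamma, a=2.0, b=1.0):
    # Hypothetical stand-in for f_theta: with x ~ Gamma(a, rate=b) and
    # z | x ~ Poisson(gamma * x), the posterior mean is (a + z) / (b + gamma).
    z = math.expm1(z_transformed)
    return (a + z) / (b + gamma)


def sample(denoiser, gammas, n, rng):
    """Reverse loop of Algorithm 2; gammas[0] is the highest SNR, gammas[T] the lowest."""
    T = len(gammas) - 1
    z = [0] * n  # step 2: initialize z at the lowest SNR to zero
    x_hat = [0.0] * n
    for t in range(T, 0, -1):
        # step 4: denoise the current counts at SNR gamma_t
        x_hat = [denoiser(data_transform(zi), gammas[t]) for zi in z]
        # step 5: re-noise the estimate at the next (higher) SNR gamma_{t-1}
        z = [poisson(gammas[t - 1] * xi, rng) for xi in x_hat]
    return x_hat


rng = random.Random(0)
T = 20
log_hi, log_lo = math.log(10.0), math.log(1e-3)  # arbitrary SNR range
gammas = [math.exp(log_hi + (log_lo - log_hi) * t / T) for t in range(T + 1)]
samples = sample(toy_denoiser, gammas, n=8, rng=rng)
```

With a trained network in place of `toy_denoiser`, the same loop structure applies; only the denoiser call and the γ grid would change.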