Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Discrete Modeling via Boundary Conditional Diffusion Processes
Authors: Yuxuan Gu, Xiaocheng Feng, Lei Huang, Yingsheng Wu, Zekun Zhou, Weihong Zhong, Kun Zhu, Bing Qin
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results indicate that our approach achieves strong performance in both language modeling and discrete image generation tasks. In language modeling, our approach surpasses previous state-of-the-art continuous diffusion language models in three translation tasks and a summarization task, while also demonstrating competitive performance compared to auto-regressive transformers. Moreover, our method achieves comparable results to continuous diffusion models when using discrete ordinal pixels and establishes a new state-of-the-art for categorical image generation on the CIFAR-10 dataset. |
| Researcher Affiliation | Collaboration | Harbin Institute of Technology; Peng Cheng Laboratory |
| Pseudocode | Yes | Algorithm 1 Training; Algorithm 2 Sampling; Algorithm 3 Gaussian Sampling |
| Open Source Code | Yes | Our framework is a module constructed on current diffusion models. We demonstrate our kernel part rescale diffusion trajectory with pseudo python code as below: ... and we will public our code on github.com. |
| Open Datasets | Yes | Our approach is experimented in both language modeling and discrete image generation. On three machine translation datasets (IWSLT14 DE-EN [Cettolo et al., 2012], WMT14 EN-DE, WMT16 EN-RO) and a text summarization dataset (GIGAWORD [Rush et al., 2015]) for language modeling... For image generation on CIFAR-10 [Krizhevsky et al., 2009]... |
| Dataset Splits | Yes | Datasets used for experiments include three translation tasks (IWSLT14 DE-EN [Cettolo et al., 2012], WMT14 EN-DE, and WMT16 EN-RO) and one text summarization task (GIGAWORD [Rush et al., 2015]) for language modeling, our proposed approach... We use CIFAR-10 [Krizhevsky et al., 2009] for discrete image generation. |
| Hardware Specification | Yes | Our experiments are performed with Nvidia 80G A100. Each language result requires about 2 days on one single A100. Each image result requires about a week on one single A100. |
| Software Dependencies | No | The paper mentions 'FAIRSEQ framework' but does not specify its version or the versions of other software dependencies like Python, PyTorch, etc. |
| Experiment Setup | Yes | During training, the diffusion step is T = 2000 and the confidence factor r = 1 for translation tasks since they have strong conditions, while r = 0.5 for summarization. Sentences are generated deterministically with 20 steps. ... The model is trained for 1.5M steps with the learning rate of 1e-4 and batch size of 128. |