Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Informed Correctors for Discrete Diffusion Models

Authors: Yixiu Zhao, Jiaxin Shi, Feng Chen, Shaul Druckmann, Lester Mackey, Scott Linderman

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We use a synthetic example to illustrate the failure modes of existing samplers and show how informed correctors alleviate these problems. On the text8 and tokenized ImageNet 256×256 datasets, our informed corrector consistently produces superior samples with fewer errors or improved FID scores for discrete diffusion models. These results underscore the potential of informed correctors for fast and high-fidelity generation using discrete diffusion.
Researcher Affiliation	Collaboration	Yixiu Zhao Stanford University EMAIL Jiaxin Shi Google DeepMind EMAIL Feng Chen Stanford University EMAIL Shaul Druckmann Stanford University EMAIL Lester Mackey Microsoft Research New England EMAIL Scott Linderman Stanford University EMAIL
Pseudocode	Yes	Algorithm 1 Backward process with informed corrector steps
Open Source Code	Yes	Our code is available at https://github.com/lindermanlab/informed-correctors.
Open Datasets	Yes	On the text8 and tokenized ImageNet 256×256 datasets, our informed corrector consistently produces superior samples with fewer errors or improved FID scores for discrete diffusion models. ... text8 dataset [44]. ... We evaluate our method on tokenized ImageNet 256×256 using the VQ tokenizer from [43].
Dataset Splits	Yes	Following standard practice, we use the provided dataset splits and train on text chunks of length 256 for 1 million steps with batch size 512. ... All models are trained on the standard training split and evaluated without classifier-free guidance.
Hardware Specification	Yes	The main experiments in this paper are performed on a v3-128 TPU pod and a v3-8 TPU machine with Google Cloud. The experiments on the text8 dataset were performed on a machine with 8 NVIDIA H100 GPUs. The ImageNet experiments are done on v3-128 TPU pod machines on Google Cloud and training takes 100 hours of wall clock time.
Software Dependencies	No	The paper does not explicitly state software dependencies with specific version numbers.
Experiment Setup	Yes	We followed the standard dataset split and trained our models on text chunks of length 256 for 1 million steps with batch size 512. ... We train the model using AdamW [46] with an initial learning rate of 1e-4 and a batch size of 512. We use a linear learning rate warmup in the first 100 steps. We adopt a stepwise learning rate schedule, dropping the learning rate to 3.3e-5 at 2.11 × 10^6 steps and to 1e-5 at 2.4 × 10^6 steps. We stop the training at 2.5 × 10^6 steps.
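
The learning rate schedule quoted in the Experiment Setup row can be sketched as a plain Python function. This is an illustrative reading of the excerpt, not the authors' code: the function name `learning_rate` and the constants are hypothetical, the warmup is assumed to ramp linearly from near zero (the excerpt does not give the starting value), and each drop is assumed to take effect at exactly the stated step.

```python
# Hypothetical constants taken from the quoted setup.
BASE_LR = 1e-4            # initial learning rate
WARMUP_STEPS = 100        # linear warmup length
FIRST_DROP = 2_110_000    # step at which the rate drops to 3.3e-5
SECOND_DROP = 2_400_000   # step at which the rate drops to 1e-5
TOTAL_STEPS = 2_500_000   # training stops here


def learning_rate(step: int) -> float:
    """Return the learning rate at a 0-indexed training step (a sketch)."""
    if step < WARMUP_STEPS:
        # Assumption: ramp linearly from BASE_LR / WARMUP_STEPS up to BASE_LR.
        return BASE_LR * (step + 1) / WARMUP_STEPS
    if step < FIRST_DROP:
        return BASE_LR
    if step < SECOND_DROP:
        return 3.3e-5
    return 1e-5


if __name__ == "__main__":
    # Print the rate at a few representative steps.
    for s in (0, 99, 1_000, FIRST_DROP, SECOND_DROP, TOTAL_STEPS - 1):
        print(s, learning_rate(s))
```

Writing the schedule as a pure function of the step makes it easy to compare against whatever scheduler the released code uses; the repository remains the authoritative source for the exact boundaries.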