Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Halton Scheduler for Masked Generative Image Transformer

Authors: Victor Besnier, Mickael Chen, David Hurych, Eduardo Valle, Matthieu Cord

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Evaluation of both class-to-image synthesis on ImageNet and text-to-image generation on the COCO dataset demonstrates that the Halton scheduler outperforms the Confidence scheduler quantitatively by reducing the FID and qualitatively by generating more diverse and more detailed images. Our code is at https://github.com/valeoai/Halton-MaskGIT." ... "This section presents a comprehensive evaluation of our method, focusing on the enhancements brought by our Halton scheduler in image quality and diversity compared to the baseline Confidence scheduler. We present qualitative and quantitative results on two distinct tasks, each using different modalities: class-to-image (subsection 4.1) and text-to-image (subsection 4.2)."
Researcher Affiliation | Collaboration | "Victor Besnier (1), Mickael Chen (2), David Hurych (1), Eduardo Valle (2), Matthieu Cord (2,3). (1) Valeo.ai, Prague; (2) Valeo.ai, Paris; (3) Sorbonne Université, Paris; now at H company, Paris. {firstname}.{lastname}@valeo.com"
Pseudocode | Yes | "Appendix 10, Pseudo-code for Halton Sequence: In algorithm 1, we detail the generation of the Halton sequence, producing a sequence of size n with a base b."
Open Source Code | Yes | "Our code is at https://github.com/valeoai/Halton-MaskGIT."
Open Datasets | Yes | "For our experiments in class-conditional image generation, we used the ImageNet dataset (Deng et al., 2009)... For the text-to-image generation experiments, we employed a combination of real-world datasets, including CC12M (Changpinyo et al., 2021) and a subset of Segment Anything (Kirillov et al., 2023), as well as synthetic datasets such as JourneyDB (Sun et al., 2024a) and DiffusionDB (Wang et al., 2022)."
Dataset Splits | No | The paper mentions training on ImageNet and evaluating on zero-shot COCO, but it does not specify explicit training/validation/test splits (e.g., percentages, exact counts, or predefined split names) for these or the other datasets used (CC12M, Segment Anything, JourneyDB, DiffusionDB) that would be needed to reproduce the data partitioning.
Hardware Specification | No | The paper mentions "Due to GPU memory constraints" and "on our GPU" in the training details, and acknowledges the "EuroHPC Joint Undertaking for awarding us access to Karolina at IT4Innovations, Czech Republic." However, it does not provide specific model numbers for GPUs (e.g., NVIDIA A100) or CPUs, or detailed specifications of the Karolina supercomputer components used for the experiments.
Software Dependencies | No | The paper describes model architectures (e.g., ViT-XL, ViT-L, T5-XL encoder) and general optimization methods (AdamW, cosine LR scheduler, bf16 precision) in Tables 5 and 6. However, it does not provide specific version numbers for key software libraries or frameworks (e.g., PyTorch 1.x, TensorFlow 2.x, CUDA 11.x) that would be required for reproducibility.
Experiment Setup | Yes | "Table 5 provides all the hyperparameters used to train our models across both modalities."

| Hyperparameter | text-to-image | class-to-image |
|---|---|---|
| Training steps | 5 × 10^5 | 2 × 10^6 |
| Batch size | 2048 | 256 |
| Learning rate | 5 × 10^-5 | 1 × 10^-4 |
| Weight decay | 0.05 | 5 × 10^-5 |
| Optimizer | AdamW | AdamW |
| Momentum | β1 = 0.9, β2 = 0.999 | β1 = 0.9, β2 = 0.96 |
| LR scheduler | Cosine | Cosine |
| Warmup steps | 2500 | 2500 |
| Gradient clip norm | 0.25 | 1 |
| EMA | 0.999 | |
| CFG dropout | 0.1 | 0.1 |
| Data aug. | No | Horizontal flip |
| Precision | bf16 | bf16 |
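The Pseudocode row above quotes the paper's Algorithm 1, which generates a Halton sequence of size n in base b. As a point of reference (not the authors' code), the standard radical-inverse construction of such a sequence can be sketched in a few lines of Python:

```python
def halton(b: int, n: int) -> list[float]:
    """First n terms of the Halton sequence in base b (radical inverse of 1..n)."""
    seq = []
    for i in range(1, n + 1):
        f, r = 1.0, 0.0
        # Reflect the base-b digits of i across the radix point.
        while i > 0:
            f /= b
            r += f * (i % b)
            i //= b
        seq.append(r)
    return seq

print(halton(2, 4))  # [0.5, 0.25, 0.75, 0.125]
```

In base 2 the sequence visits 1/2, 1/4, 3/4, 1/8, ..., filling the unit interval quasi-uniformly; the paper exploits this low-discrepancy property to spread unmasked tokens evenly across the image.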
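The table reports a cosine LR scheduler with 2500 warmup steps but does not state the exact formula. A minimal sketch of one common convention, assuming linear warmup to the base rate followed by cosine decay to zero (the class-to-image column's values are used for illustration):

```python
import math

def cosine_lr(step: int, base_lr: float = 1e-4,
              warmup: int = 2500, total: int = 2_000_000) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    if step < warmup:
        return base_lr * step / warmup  # ramp from 0 to base_lr
    progress = (step - warmup) / (total - warmup)  # in [0, 1]
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(2500))  # peak rate: 1e-4
```

Other decay floors (e.g., a small minimum LR instead of zero) are equally plausible readings of "Cosine"; the paper's code should be consulted for the exact rule.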