Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Entropic Time Schedulers for Generative Diffusion Models

Authors: Dejan Stancevic, Florian Handke, Luca Ambrogioni

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In our experiments with mixtures of Gaussian distributions and ImageNet, we show that using the (rescaled) entropic times greatly improves the inference performance of trained models. In particular, we found that the image quality in pretrained EDM2 models, as evaluated by FID and FD-DINO scores, can be substantially increased by the rescaled entropic time reparameterization without increasing the number of function evaluations, with greater improvements in the few NFEs regime.
Researcher Affiliation	Academia	Dejan Stančević (Radboud University), Florian Handke (Ghent University), Luca Ambrogioni (Radboud University)
Pseudocode	Yes	Algorithm 1 Sampling using time change... Algorithm 2 Estimation of rescaled entropy, ∫ σ(τ) H[x_0 | x_τ] dτ... Algorithm 3 Estimation of spectral decomposition of ϵ²(t)
Open Source Code	Yes	Code is available at https://github.com/DejanStancevic/Entropic-Time-Schedulers-for-Generative-Diffusion-Models.
Open Datasets	Yes	We compare the performance of trained EDM and EDM2 models (Karras et al., 2022, 2024) on CIFAR10 (Krizhevsky et al., 2009), FFHQ (Karras et al., 2019), and ImageNet (Russakovsky et al., 2015) using the FID (Heusel et al., 2017) and FD-DINOv2 (Oquab et al., 2023; Stein et al., 2023) scores.
Dataset Splits	No	The paper uses well-known public datasets (CIFAR10, FFHQ, ImageNet) and mentions generating 50000 images for evaluation against pre-computed reference statistics. However, it does not specify the train/test/validation splits used for the underlying pre-trained models (EDM, EDM2) or for any new training performed by the authors within this paper. The paper does not provide specific dataset split information.
Hardware Specification	No	The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. The NeurIPS Paper Checklist also explicitly states for this question: 'Answer: [No] Justification:'.
Software Dependencies	No	The paper mentions using the implementations provided by https://github.com/NVlabs/edm for CIFAR10 and FFHQ, and https://github.com/NVlabs/edm2 for ImageNet. However, it does not specify version numbers for any software dependencies (e.g., Python, PyTorch, CUDA).
Experiment Setup	Yes	For generating samples, we used the stochastic and deterministic DDIM (Song et al., 2022). To compare performance between different runs, we used the FID (Heusel et al., 2017) and, for ImageNet, FD-DINOv2 (Oquab et al., 2023; Stein et al., 2023) scores... For all data sets, entropy and rescaled entropy were calculated using an estimation of squared error using equation 13. The squared error was estimated at 128 time points according to the EDM schedule (ρ = 7, σ_min = 0.002, σ_max = 80) using the Monte-Carlo method with 1024 samples at each timestep.
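
For readers who want to reproduce the time grid quoted in the Experiment Setup row, the following is a minimal sketch of the EDM noise-level schedule from Karras et al. (2022), evaluated with the values reported above (128 points, ρ = 7, σ_min = 0.002, σ_max = 80). The function name `edm_sigma_schedule` and its argument names are illustrative assumptions, not taken from the authors' repository, and the sketch covers only the grid itself, not the Monte-Carlo squared-error estimate of equation 13.

```python
def edm_sigma_schedule(num_steps=128, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Return the EDM noise levels sigma_0 > sigma_1 > ... > sigma_{N-1}.

    Uses the schedule of Karras et al. (2022):
        sigma_i = (sigma_max^(1/rho)
                   + i / (N - 1) * (sigma_min^(1/rho) - sigma_max^(1/rho)))^rho
    The function and parameter names are hypothetical, chosen for illustration.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    inv_rho = 1.0 / rho
    hi = sigma_max ** inv_rho  # endpoint at i = 0
    lo = sigma_min ** inv_rho  # endpoint at i = N - 1
    return [
        (hi + i / (num_steps - 1) * (lo - hi)) ** rho
        for i in range(num_steps)
    ]


# Grid with the settings quoted in the Experiment Setup row.
sigmas = edm_sigma_schedule()
print(len(sigmas), sigmas[0], sigmas[-1])
```

With ρ = 7 the grid is strongly non-uniform: most of the 128 points fall at low noise levels, which is where the squared error would be sampled most densely under this schedule.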