Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Entropy Rectifying Guidance for Diffusion and Flow Models
Authors: Tariq Berrada, Adriana Romero-Soriano, Michal Drozdzal, Jakob J. Verbeek, Karteek Alahari
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that ERG results in significant improvements in various tasks, including text-to-image, class-conditional and unconditional image generation. We also show that ERG can be seamlessly combined with other recent guidance methods such as CADS and APG, further improving generation results. ... 4 Experimental evaluation |
| Researcher Affiliation | Collaboration | 1 FAIR at Meta 2 Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, France 3 McGill University 4 Mila, Quebec AI Institute 5 Canada CIFAR AI chair |
| Pseudocode | Yes | Algorithm 1 Entropy rectifying guidance |
| Open Source Code | Yes | By releasing our code transparently, we provide a way for researchers to study and counter the potential harmful effects of our method being misused, allowing for the development of defense strategies. |
| Open Datasets | Yes | We use a face-blurred version of ImageNet (Deng et al., 2009) to train class-conditional models at 256 and 512 resolution... For the text-to-image model, we use an architecture similar to MMDiT (Esser et al., 2024), and train a 512 resolution model on a mix of a proprietary dataset of 320M text-image pairs and YFCC100M (Thomee et al., 2016)... For evaluation of text-to-image and unconditional generation, we use the 40k COCO 14 validation image-caption pairs. |
| Dataset Splits | Yes | For evaluation of text-to-image and unconditional generation, we use the 40k COCO 14 validation image-caption pairs. For the class-conditional models, we sample 50 images for each of the 1,000 ImageNet classes and use the ImageNet validation set as a reference. |
| Hardware Specification | No | The paper does not explicitly mention specific hardware details like GPU models, CPU types, or cloud instance specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using models like Llama3-8B and Flan-T5-XL, and libraries such as EvalGIM, but it does not specify version numbers for software components (e.g., PyTorch 1.x, Python 3.x), which are required for a reproducible description of ancillary software. |
| Experiment Setup | Yes | All evaluated models are sampled using the Euler method with 50 sampling steps. We use the EvalGIM (Hall et al., 2024) library for all evaluations. Baselines. In addition to the standard classifier-free guidance, we compare our method to several recent state-of-the-art guidance techniques: Condition-Annealed Diffusion Sampler (CADS) (Sadat et al., 2024), Adaptive Projected Guidance (APG) (Sadat et al., 2025), Smooth Energy Guidance (SEG) (Hong, 2024), and Auto-Guidance (Karras et al., 2024). For APG, we follow the recommendations from the paper and set γ_APG = 0.5, η_APG = 0.0, r_APG = 5.0. For CADS, we perform a grid search over τ1_CADS ∈ [0.6, 0.8], τ2_CADS ∈ [0.8, 1.0], s_CADS ∈ [0.25, 1.0], ψ_CADS = 1.0. ... We used K = γ = 1 in our default setup in our experiments, unless specified otherwise. |
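
The sampling setup quoted in the Experiment Setup row (Euler method, 50 steps, standard classifier-free guidance as the baseline) can be illustrated with a minimal sketch. This is not the paper's ERG algorithm or its code: the function names, the `velocity_fn(x, t, cond)` interface, the toy velocity field, and the convention of integrating from t = 0 to t = 1 are all assumptions made for illustration.

```python
def cfg_combine(v_cond, v_uncond, scale):
    """Standard classifier-free guidance: start from the unconditional
    prediction and extrapolate towards the conditional one."""
    return [u + scale * (c - u) for c, u in zip(v_cond, v_uncond)]


def euler_sample(velocity_fn, x0, cond, num_steps=50, scale=1.0):
    """Fixed-step Euler integration of dx/dt = v(x, t) over t in [0, 1].

    velocity_fn(x, t, cond) is a hypothetical model interface;
    cond=None requests the unconditional prediction.
    """
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = cfg_combine(velocity_fn(x, t, cond), velocity_fn(x, t, None), scale)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def toy_velocity(x, t, cond):
    """Toy stand-in for a trained network (assumption): a constant field
    equal to `cond`, or zero when unconditional."""
    value = 0.0 if cond is None else float(cond)
    return [value for _ in x]


if __name__ == "__main__":
    sample = euler_sample(toy_velocity, [0.0, 0.0, 0.0], cond=1.0, num_steps=50, scale=2.0)
    print(sample)
```

With a guidance scale of 1 the combined prediction equals the conditional one; larger scales push the sample further along the difference between the conditional and unconditional predictions, which is the behaviour the guidance methods listed in the table modify in different ways.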