Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models
Authors: Joshua Tian Jin Tee, Hee Suk Yoon, Abu Hanif Muhammad Syarubany, Eunseop Yoon, Chang D. Yoo
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experimental Results 4.1 Experimental Setup Datasets and Models. We fine-tune both Stable Diffusion 1.5 [18] (Creativeml-openrail-m License) and SDXL [19] (Openrail++ License) models using the GradSPO objective, as detailed in Section 3. |
| Researcher Affiliation | Academia | Korea Advanced Institute of Science and Technology (KAIST) |
| Pseudocode | No | The paper describes methods using mathematical formulations and textual explanations but does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | Code and models are available at https://github.com/JoshuaTTJ/GradSPO. |
| Open Datasets | Yes | We train the models on 4,000 randomly sampled prompts from the Pick-a-Pic v1 dataset [20], which contains 580,000 pairs of image preference for each prompt. For evaluation, unless stated otherwise, we used the test set consisting of 500 prompts sourced from the Pick-a-Pic v2 dataset, similar to previous work in the field [12, 10]. |
| Dataset Splits | Yes | Following the SPO training scheme, we train the models on 4,000 randomly sampled prompts from the Pick-a-Pic v1 dataset [20]... For evaluation, unless stated otherwise, we used the test set consisting of 500 prompts sourced from the Pick-a-Pic v2 dataset, similar to previous work in the field [12, 10]. |
| Hardware Specification | Yes | GPU Setup 4x NVIDIA A100 |
| Software Dependencies | No | The paper mentions using AdamW [38] as the optimizer, but does not specify versions of the software frameworks (e.g., PyTorch, TensorFlow) or other key libraries used for implementation. |
| Experiment Setup | Yes | Hyperparameters (SD 1.5 / SDXL): learning rate 6e-5 / 1e-5; # of epochs 10 / 10; batch size 40 / 16; µ 0.9 / 0.9; β 10 / 10; κ [0, 750] / [0, 750]; LoRA rank 4 / 64; cfg during training 5.0 / 5.0; # of samples per step 4 / 4; sampling steps during training 20 / 20; GPU setup 4x NVIDIA A100 / 4x NVIDIA A100. Additionally, the time-dependent weight function αt is set to 1, the guidance scale γt is fixed at 0.5, and the Exponential Moving Average (EMA) decay rate µ is set to 0.9. |
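
The reported hyperparameters can be collected into a small configuration object for anyone attempting a reproduction. The sketch below is only an illustration: the class name `GradSPOConfig`, its field names, and the `CONFIGS` dictionary keys are hypothetical and are not taken from the paper or its repository. Only the numeric values come from the Experiment Setup row above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class GradSPOConfig:
    """Hypothetical container for the hyperparameters reported in the table."""

    learning_rate: float
    epochs: int
    batch_size: int
    lora_rank: int
    ema_decay: float = 0.9                   # µ, EMA decay rate
    beta: float = 10.0                       # β
    kappa_range: Tuple[int, int] = (0, 750)  # κ
    train_cfg_scale: float = 5.0             # cfg during training
    samples_per_step: int = 4
    train_sampling_steps: int = 20
    alpha_t: float = 1.0                     # time-dependent weight αt
    guidance_scale: float = 0.5              # γt
    gpu_setup: str = "4x NVIDIA A100"


# Values that differ between the two base models are passed explicitly;
# the shared values are the dataclass defaults.
CONFIGS: Dict[str, GradSPOConfig] = {
    "sd15": GradSPOConfig(learning_rate=6e-5, epochs=10, batch_size=40, lora_rank=4),
    "sdxl": GradSPOConfig(learning_rate=1e-5, epochs=10, batch_size=16, lora_rank=64),
}

if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(name, cfg)
```

Keeping the shared settings as defaults makes the per-model differences (learning rate, batch size, LoRA rank) easy to see at a glance. The table does not say whether the batch size is global or per GPU, so the sketch records it as reported.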