Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

DP²O-SR: Direct Perceptual Preference Optimization for Real-World Image Super-Resolution

Authors: Rongyuan Wu, Lingchen Sun, Zhengqiang Zhang, Shihao Wang, Tianhe Wu, Qiaosi Yi, Shuai Li, Lei Zhang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments across both diffusion- and flow-based T2I backbones demonstrate that DP2O-SR significantly improves perceptual quality and generalizes well to real-world benchmarks.
Researcher Affiliation	Collaboration	Rongyuan Wu1,2, Lingchen Sun1,2, Zhengqiang Zhang1,2, Shihao Wang1, Tianhe Wu2,3, Qiaosi Yi1,2, Shuai Li1, Lei Zhang1,2, 1The Hong Kong Polytechnic University 2OPPO Research Institute 3City University of Hong Kong
Pseudocode	No	The paper describes methods and formulas but does not include any clearly labeled pseudocode or algorithm blocks. For example, it presents the training objective of Diff-DPO as a mathematical formula (Eq. 1) rather than a step-by-step algorithm.
Open Source Code	Yes	Corresponding author https://github.com/cswry/DP2O-SR
Open Datasets	Yes	We evaluate DP2O-SR on the out-of-domain RealSR benchmark [5]
Dataset Splits	Yes	Among them, 30,000 images are used for post-training, while the remaining 100 images form the Syn-Test set.
Hardware Specification	Yes	All experiments are conducted on 8 A800 GPUs.
Software Dependencies	No	The paper mentions software like Stable Diffusion 2.0 (SD2) [1], FLUX.1-Dev (FLUX) [20], and the diffusers [39] library, but does not specify version numbers for these components.
Experiment Setup	Yes	The C-FLUX model is finetuned for the Real-ISR task using approximately 1 million high-quality images. We use a batch size of 32, a learning rate of 1 × 10⁻⁴, and train the model for 45,000 steps. The C-SD2 variant follows a similar setup, but is trained with a batch size of 256, a learning rate of 2 × 10⁻⁴, and 35,000 steps. DP2O-SR Training Configuration. We train DP2O-SR with a batch size of 1024, a learning rate of 2 × 10⁻⁵, and set the preference weighting hyperparameter β to 5,000. The model is trained for 1,000 iterations.
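
The hyperparameters quoted in the Experiment Setup and Dataset Splits rows can be gathered into a small configuration sketch, which makes it easier to compare the three training runs. This is an illustration only, not the authors' code. The class, field, and function names are hypothetical. The learning rates are read as 1e-4, 2e-4, and 2e-5 (the exponent signs are an interpretation of the extracted text), and `samples_seen` assumes the reported batch size is the global batch and that each step consumes one full batch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (names are illustrative)."""
    name: str
    batch_size: int
    learning_rate: float
    steps: int

    def samples_seen(self) -> int:
        # Assumption: batch_size is the global batch and every step uses a full batch.
        return self.batch_size * self.steps


# Values as reported in the table; learning-rate exponents are assumed negative.
CONFIGS = [
    TrainConfig("C-FLUX", batch_size=32, learning_rate=1e-4, steps=45_000),
    TrainConfig("C-SD2", batch_size=256, learning_rate=2e-4, steps=35_000),
    TrainConfig("DP2O-SR", batch_size=1024, learning_rate=2e-5, steps=1_000),
]

# Preference weighting hyperparameter beta reported for DP2O-SR training.
DP2O_SR_BETA = 5_000

# Split reported in the Dataset Splits row.
POST_TRAINING_IMAGES = 30_000
SYN_TEST_IMAGES = 100


def summarize(configs=CONFIGS):
    """Return the number of training samples processed by each run."""
    return {c.name: c.samples_seen() for c in configs}


if __name__ == "__main__":
    for name, n in summarize().items():
        print(f"{name}: {n:,} samples processed")
    print(f"Total images in split: {POST_TRAINING_IMAGES + SYN_TEST_IMAGES:,}")
```

Under these assumptions, the DP2O-SR run processes about one million samples in 1,000 iterations, which is far fewer optimizer steps than either base-model finetuning run.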