Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Self-Supervised Selective-Guided Diffusion Model for Old-Photo Face Restoration

Authors: Wenjie Li, Xiangyi Wang, Heng Guo, Guangwei Gao, Zhanyu Ma

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 Experiments; 4.1 Experiments Setting; 4.2 Comparisons with Existing Methods; 4.3 Ablation Studies. Table 1: Quantitative comparison of old-photo face images, which are categorized as simple, medium, and hard based on degradation degree.
Researcher Affiliation | Academia | Wenjie Li¹, Xiangyi Wang¹, Heng Guo¹, Guangwei Gao², Zhanyu Ma¹. ¹Beijing University of Posts and Telecommunications; ²Nanjing University of Science and Technology. EMAIL EMAIL
Pseudocode | Yes | Algorithm 1: Staged Self-Supervised Sampling in our SSDiff.
Open Source Code | Yes | Code link: https://github.com/PRIS-CV/SSDiff.
Open Datasets | Yes | Our Appendix provides results on public test sets, including WebPhoto-Test [6] and CelebA-Child [6].
Dataset Splits | Yes | For evaluation, we randomly collect 300 old face photographs from the Internet as our benchmark, called VintageFace. All images are cropped and aligned to 512 × 512 using the open-source FaceLib library. We further categorize them into three levels of degradation: simple, medium, and hard, based on image quality, 100 images each.
Hardware Specification | Yes | All experiments are implemented in PyTorch framework on an NVIDIA RTX 4090 GPU.
Software Dependencies | No | All experiments are implemented in PyTorch framework on an NVIDIA RTX 4090 GPU. We use the pre-trained real-time model BiSeNet [36] and the scratch detection model from [37] to obtain face parsing maps and scratch masks from inputs, respectively.
Experiment Setup | Yes | Our pre-trained diffusion model is an unconditional denoising network trained on FFHQ [17] datasets, which learns to reconstruct high-quality faces from pure noise over T = 1000 steps. We restore breakage facial regions during the first 600 steps, and apply color migration in the remaining 400 steps (T1 = 400). The strong gradient factor of our SSDiff is s_s = 3.5e-3. PGDiff [16] is also set to this value for fairness. ... Thus, we set s_w = 1e-3 and s_s = 3.5e-3 to obtain best results.
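
The step budget quoted in the Experiment Setup row can be checked with a small sketch. This is not the authors' code: the helper names are hypothetical, the stage labels are made up for illustration, and the timestep indexing (counting down from T to 1, with color migration active once t ≤ T1) is an assumption that simply reproduces the stated 600/400 split. The gradient factors are read here as 3.5e-3 and 1e-3, and are listed only as constants because the row does not say how they enter the sampling loop.

```python
from collections import Counter

# Values quoted in the Experiment Setup row.
T = 1000   # total denoising steps
T1 = 400   # steps reserved for color migration

# Gradient factors, read as 3.5e-3 and 1e-3 (assumed reading of the quoted values).
# They are listed for reference only; how they are applied is not stated in the row.
S_STRONG = 3.5e-3
S_WEAK = 1e-3


def stage_for_timestep(t: int) -> str:
    """Hypothetical helper: name the stage active at reverse timestep t.

    Assumes sampling counts down from t = T to t = 1, so the "first 600 steps"
    are t = 1000 .. 401 and the "remaining 400 steps" are t = 400 .. 1.
    """
    if not 1 <= t <= T:
        raise ValueError(f"timestep must be in [1, {T}], got {t}")
    return "breakage_restoration" if t > T1 else "color_migration"


def stage_counts() -> Counter:
    """Count how many of the T reverse steps fall into each stage."""
    return Counter(stage_for_timestep(t) for t in range(T, 0, -1))


if __name__ == "__main__":
    for stage, count in stage_counts().items():
        print(f"{stage}: {count} steps")
```

Under these assumptions the counts come out to 600 restoration steps and 400 color-migration steps, matching the split described in the row.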