Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

FairDD: Fair Dataset Distillation

Authors: Qihang Zhou, ShenHao Fang, Shibo He, Wenchao Meng, Jiming Chen

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experimental evaluations demonstrate that FairDD significantly improves fairness compared to vanilla DDs, with a promising trade-off between fairness and accuracy. Its consistent superiority across diverse DDs, spanning Distribution and Gradient Matching, establishes it as a versatile FDD approach.
Researcher Affiliation	Academia	Qihang Zhou, Shenhao Fang, Shibo He, Wenchao Meng, Jiming Chen. Zhejiang University. EMAIL
Pseudocode	No	The paper describes methods and equations, but does not include a dedicated 'Pseudocode' or 'Algorithm' block.
Open Source Code	Yes	Code is available at https://github.com/zqhang/FairDD.
Open Datasets	Yes	Comprehensive experiments are conducted on publicly available datasets with diverse types of bias, including foreground bias (FG), background bias (BG), combined BG & FG bias, and real-world bias. The evaluated datasets include synthetic datasets: C-MNIST (FG), C-MNIST (BG), Colored-FMNIST (FG), Colored-FMNIST (BG), and CIFAR10-S (BG & FG), as well as real-world datasets: CelebA, UTKFace, and BFFHQ. For more details on these datasets, please refer to Appendix B and C.
Dataset Splits	Yes	Given a vast dataset $T = \{(x_i, y_i)\}_{i=1}^{N}$, DDs aim to condense original dataset $T$ into a smaller dataset $S = \{(x_i, y_i)\}_{i=1}^{M}$ via distillation algorithm Alg with neural networks, parameterized by θ. We test the model bias trained on S. C-MNIST (BG) adopts the same operation on the background and keeps the foreground unchanged. We also report the condensed ratio at IPC 10, 50, and 100, which is computed by the ratio of the condensed dataset size to the training set size. Table 8: Statistics for all datasets used in our paper. Columns: Datasets; TA; PA; TA number; PA number; Training set size; Test set size; BR in Training set; BR in Test set; Condensed ratio (IPC 10, 50, 100). Row C-MNIST (FG): Digital number; Digital color; 10; 10; 60000; 10000; 0.90; balance; 0.17%, 0.83%, 1.67%. Additionally, all test sets are balanced, with equal sample sizes across groups.
Hardware Specification	Yes	Experiments are conducted on PyTorch 2.0.0 with a single NVIDIA RTX 3090 24GB GPU.
Software Dependencies	Yes	Experiments are conducted on PyTorch 2.0.0 with a single NVIDIA RTX 3090 24GB GPU.
Experiment Setup	Yes	Implementation details: We default to a BR of 0.9 for all synthetic original datasets to induce significant PA skew. We use distilled datasets S from different DDs to train and evaluate ConvNet with the same parameters, and then report the corresponding fairness and accuracy.
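
The Dataset Splits entry defines the condensed ratio as the condensed dataset size divided by the training set size. The sketch below reproduces the three percentages quoted for C-MNIST (FG). It assumes that IPC means images per class, so that the condensed set holds IPC × (number of classes) images; the source does not spell this out. The function name `condensed_ratio` is hypothetical and is not taken from the FairDD code.

```python
def condensed_ratio(ipc: int, num_classes: int, train_size: int) -> float:
    """Return the condensed dataset size as a fraction of the training set size.

    Assumption: the condensed set contains `ipc` images for each of the
    `num_classes` target classes.
    """
    if train_size <= 0:
        raise ValueError("train_size must be positive")
    return ipc * num_classes / train_size


# C-MNIST (FG) figures quoted from Table 8: 10 target classes, 60000 training images.
for ipc in (10, 50, 100):
    ratio = condensed_ratio(ipc, num_classes=10, train_size=60000)
    print(f"IPC {ipc}: {ratio:.2%}")
```

Under that assumption the loop prints 0.17%, 0.83%, and 1.67%, which agrees with the condensed ratios listed for C-MNIST (FG).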