Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Fairness-aware Bayes Optimal Functional Classification

Authors: Xiaoyu Hu, Gengyu Xue, Zhenhua Lin, Yi Yu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our theoretical findings are complemented by extensive numerical experiments on synthetic and real datasets, highlighting the practicality of our designed algorithm.
Researcher Affiliation	Academia	School of Mathematics and Statistics, Xi'an Jiaotong University; Department of Statistics, University of Warwick; Department of Statistics and Data Science, National University of Singapore; Department of Statistics, University of Warwick
Pseudocode	Yes	Algorithm 1 Fair Functional Linear Discriminant Analysis classifier.
Open Source Code	Yes	We have submitted code including those generating all the numerical results in this paper.
Open Datasets	Yes	For the real dataset, we use the 2005-2006 National Health and Nutrition Examination Survey data (CDC, 2006), where the sensitive attribute is race and the classification task is to determine if an individual is under 20 or over 50 years old based on the quantile function of intensity values. ... The real dataset is obtained from https://wwwn.cdc.gov/nchs/nhanes/ContinuousNhanes/Default.aspx?BeginYear=2005.
Dataset Splits	Yes	The final dataset consists of 3252 instances, which we randomly split into equal-sized training and test subsets. ... Truncation levels are selected via 5-fold cross-validation, specifically by minimising the average classification error associated with the unconstrained classifier. ... One subset is used to estimate $\hat{\eta}_a$ and $\hat{\pi}_{a,y}$, while the other is used to estimate the threshold $\hat{\tau}$. ... To mitigate the randomness caused by random splitting, we adopt a cross-fitting approach and define the final probabilistic classifier as the average $\hat{f} = (\hat{f}_1 + \hat{f}_2)/2$.
Hardware Specification	Yes	Experiments were conducted on a server equipped with an Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (28 cores) and 503GB of RAM.
Software Dependencies	Yes	We implemented all methods in R (version 4.3.1).
Experiment Setup	Yes	Truncation levels are selected via 5-fold cross-validation, specifically by minimising the average classification error associated with the unconstrained classifier. Fair-FLDA: calibration constant set to 0; Fair-FLDAc: calibration constant set to $\min\{\sqrt{2\log(1/\rho)/n}, \delta\}$, with $\rho = 0.05$. For the simulation results, we generate $(Y, A) \in \{0, 1\}^2$ according to the distributions $P(A = 1) = 0.7$, $P(Y = 1 \mid A = 0) = 0.4$ and $P(Y = 1 \mid A = 1) = 0.7$. Given $Y = y$ and $A = a$, generate the functional covariate $X_{a,y}(t)$ as $X_{a,y}(t) = \mu_{a,y}(t) + \sum_{k=1}^{50} \zeta_{a,k}\phi_k(t)$, where $\phi_k(t) = 2\cos(k\pi t)$, $\zeta_{a,k} \sim N(0, \lambda_{a,k})$, $\lambda_{0,k} = k^{-2}$, $\lambda_{1,k} = 2k^{-2}$, and the mean functions are specified as follows, $\mu_{0,0} = \mu_{1,0} = 0$, $\mu_{0,1}(t) = \sum_{k \ge 1} 0.8(-1)^k k^{-\beta}\phi_k(t)$, $\mu_{1,1}(t) = \sum_{k \ge 1} 2(-1)^k k^{-\beta}\phi_k(t)$. Let $\beta = 1.5$ and $n = 1000$.
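
The simulation design quoted in the Experiment Setup row can be illustrated with a short sketch. This is a minimal, standard-library Python sketch under stated assumptions, not the authors' implementation (the paper reports using R 4.3.1). Assumptions not given in the row: curves are evaluated on an equally spaced grid on [0, 1]; the mean-function sums are truncated at the same 50 basis terms as the noise expansion; the basis is used as printed, phi_k(t) = 2 cos(k pi t); and `delta` is supplied by the caller because the row does not state a value for δ. All function names are hypothetical.

```python
import math
import random


def phi(k, t):
    # Basis function as printed in the setup row: phi_k(t) = 2 cos(k pi t).
    return 2.0 * math.cos(k * math.pi * t)


def mean_coef(a, y, k, beta=1.5):
    # Coefficient of phi_k in mu_{a,y}: zero when y = 0,
    # 0.8 (-1)^k k^(-beta) for (a, y) = (0, 1), 2 (-1)^k k^(-beta) for (1, 1).
    if y == 0:
        return 0.0
    scale = 0.8 if a == 0 else 2.0
    return scale * (-1) ** k * k ** (-beta)


def sample_labels(rng):
    # P(A = 1) = 0.7, P(Y = 1 | A = 0) = 0.4, P(Y = 1 | A = 1) = 0.7.
    a = 1 if rng.random() < 0.7 else 0
    p_y = 0.7 if a == 1 else 0.4
    y = 1 if rng.random() < p_y else 0
    return y, a


def sample_curve(y, a, grid, rng, n_basis=50, beta=1.5):
    # Draw zeta_{a,k} ~ N(0, lambda_{a,k}) with lambda_{0,k} = k^-2 and
    # lambda_{1,k} = 2 k^-2, add the mean coefficient, then evaluate on the grid.
    # Assumption: the mean expansion uses the same n_basis terms as the noise.
    coefs = []
    for k in range(1, n_basis + 1):
        lam = (2.0 if a == 1 else 1.0) * k ** (-2)
        zeta = rng.gauss(0.0, math.sqrt(lam))
        coefs.append(mean_coef(a, y, k, beta) + zeta)
    return [sum(c * phi(k, t) for k, c in enumerate(coefs, start=1)) for t in grid]


def simulate(n=1000, n_grid=51, seed=0):
    # Assumption: an equally spaced grid on [0, 1]; the row does not state the domain.
    rng = random.Random(seed)
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    data = []
    for _ in range(n):
        y, a = sample_labels(rng)
        data.append((y, a, sample_curve(y, a, grid, rng)))
    return grid, data


def calibration_constant(n, delta, rho=0.05):
    # Fair-FLDAc constant from the row: min{ sqrt(2 log(1/rho) / n), delta }.
    # delta has no stated value in the row, so the caller must supply it.
    return min(math.sqrt(2.0 * math.log(1.0 / rho) / n), delta)
```

Calling `simulate()` with its defaults returns the evaluation grid and a list of 1000 `(y, a, curve)` tuples. Fair-FLDA corresponds to a calibration constant of 0, so `calibration_constant` is only relevant to the Fair-FLDAc variant.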