Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

On the consistent estimation of optimal Receiver Operating Characteristic (ROC) curve

Authors: Renxiong Liu, Yunzhang Zhu

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we examine the operating characteristics of the three methods analyzed in this article with both simulation studies and a real data example. In our simulation, we investigate the effect of model mis-specification by considering linear classifiers over two simulated data sets. One dataset is generated under a linear discriminant analysis (LDA) setting, which imitates the scenario where model is correctly specified. The other is based on a quadratic discriminant analysis (QDA) setting, which is used to show the differences across three methods under model mis-specification. We also compare the three methods by using a bank marketing data set [Moro et al., 2014], which allows us to investigate the effect model mis-specification on the performance of ROC curve estimation methods in real problem setting. Throughout all the experiments, we consider linear classification methods including constrained ψ-learning, weighted SVM and its cutoff version, and also typical kernel methods that include kernel weighted SVM and its cutoff version.
Researcher Affiliation	Academia	Renxiong Liu, Department of Statistics, Ohio State University, Columbus, OH 43210, EMAIL; Yunzhang Zhu, Department of Statistics, Ohio State University, Columbus, OH 43210, EMAIL
Pseudocode	No	No pseudocode or algorithm blocks found in the paper.
Open Source Code	No	[No] We will release our codes later.
Open Datasets	Yes	Example 3 (Real data example). This example considers a bank marketing dataset [Moro et al., 2014], which records the direct marketing campaigns of a Portuguese banking institution and are available at https://archive.ics.uci.edu/ml/datasets/Bank+Marketing.
Dataset Splits	No	No explicit train/validation/test dataset splits by percentage or count are provided. The paper mentions
Hardware Specification	No	No specific hardware details (GPU/CPU models, memory, etc.) for running experiments are provided.
Software Dependencies	No	No specific ancillary software details with version numbers are provided.
Experiment Setup	Yes	For all three examples, to generate the estimated ROC curve we vary weight w and the constraint upper bound α from {i/500 | i = 0, 1, ..., 500} for the weighted method and the constrained method, respectively. For the two cutoff methods, we choose w = 1/2.
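
The parameter sweep quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code (the paper states that code is not yet released). The names `parameter_grid`, `trace_roc`, and `toy_evaluate` are hypothetical, and `toy_evaluate` is a made-up stand-in for fitting a weighted or constrained classifier and measuring its false and true positive rates.

```python
def parameter_grid(n=500):
    """Return the grid {i/n | i = 0, 1, ..., n} used for w or alpha."""
    return [i / n for i in range(n + 1)]


def trace_roc(fit_and_evaluate, grid):
    """Evaluate one classifier per grid value and collect ROC points.

    fit_and_evaluate is a hypothetical callable that takes a grid value
    (a weight w or a constraint bound alpha) and returns (fpr, tpr).
    """
    points = [fit_and_evaluate(value) for value in grid]
    return sorted(points)  # order by false positive rate, then true positive rate


def toy_evaluate(alpha):
    # Assumption for illustration only: pretend the fitted classifier
    # attains FPR = alpha and TPR = sqrt(alpha). A real run would fit a
    # model on training data and measure both rates on held-out data.
    return (alpha, alpha ** 0.5)


grid = parameter_grid()                      # 501 values from 0 to 1
roc_points = trace_roc(toy_evaluate, grid)   # one (fpr, tpr) pair per value
```

Replacing `toy_evaluate` with a function that actually trains a model for each grid value would yield an estimated ROC curve with up to 501 points per method.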