Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
On the consistent estimation of optimal Receiver Operating Characteristic (ROC) curve
Authors: Renxiong Liu, Yunzhang Zhu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we examine the operating characteristics of the three methods analyzed in this article with both simulation studies and a real data example. In our simulation, we investigate the effect of model mis-specification by considering linear classifiers over two simulated data sets. One dataset is generated under a linear discriminant analysis (LDA) setting, which imitates the scenario where model is correctly specified. The other is based on a quadratic discriminant analysis (QDA) setting, which is used to show the differences across three methods under model mis-specification. We also compare the three methods by using a bank marketing data set [Moro et al., 2014], which allows us to investigate the effect model mis-specification on the performance of ROC curve estimation methods in real problem setting. Throughout all the experiments, we consider linear classification methods including constrained ψ-learning, weighted SVM and its cutoff version, and also typical kernel methods that include kernel weighted SVM and its cutoff version |
| Researcher Affiliation | Academia | Renxiong Liu Department of Statistics Ohio State University Columbus, OH 43210 EMAIL Yunzhang Zhu Department of Statistics Ohio State University Columbus, OH 43210 EMAIL |
| Pseudocode | No | No pseudocode or algorithm blocks found in the paper. |
| Open Source Code | No | [No] We will release our codes later. |
| Open Datasets | Yes | Example 3 (Real data example). This example considers a bank marketing dataset [Moro et al., 2014], which records the direct marketing campaigns of a Portuguese banking institution and are available at https://archive.ics.uci.edu/ml/datasets/Bank+Marketing. |
| Dataset Splits | No | No explicit train/validation/test dataset splits by percentage or count are provided. The paper mentions |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, etc.) for running experiments are provided. |
| Software Dependencies | No | No specific ancillary software details with version numbers are provided. |
| Experiment Setup | Yes | For all three examples, to generate the estimated ROC curve we vary weight w and the constraint upper bound α from {i/500 \| i = 0, 1, ..., 500} for the weighted method and the constrained method, respectively. For the two cutoff methods, we choose w = 1/2. |
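
The grid in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code, which had not been released at the time of classification. The names `GRID`, `roc_point`, `trace_roc`, and `fit_predict` are hypothetical; `fit_predict` stands in for whichever method is being swept (the weighted method over w, or the constrained method over α) and is assumed to return predicted labels in {0, 1} for a fixed evaluation set.

```python
# Grid {i/500 | i = 0, 1, ..., 500}: 501 evenly spaced values in [0, 1].
GRID = [i / 500 for i in range(501)]


def roc_point(y_true, y_pred):
    """Return (false positive rate, true positive rate) for labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return fp / n_neg, tp / n_pos


def trace_roc(fit_predict, y_true, grid=GRID):
    """Collect one estimated ROC point per grid value.

    `fit_predict` is a hypothetical callable: given a grid value (the weight w
    or the bound alpha), it returns predicted labels for the evaluation set.
    """
    return [roc_point(y_true, fit_predict(v)) for v in grid]


# Toy stand-in for a fitted method (an assumption for illustration only):
# it thresholds fixed scores at the grid value instead of refitting a classifier.
scores = [0.9, 0.2, 0.6, 0.4]
labels = [1, 0, 1, 0]
curve = trace_roc(lambda v: [1 if s >= v else 0 for s in scores], labels)
```

Here `curve` holds 501 (FPR, TPR) pairs, one per grid value. In the paper's setup each grid value would instead trigger a new fit of the weighted or constrained classifier.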