On the consistent estimation of optimal Receiver Operating Characteristic (ROC) curve
Authors: Renxiong Liu, Yunzhang Zhu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we examine the operating characteristics of the three methods analyzed in this article with both simulation studies and a real data example. In our simulation, we investigate the effect of model mis-specification by considering linear classifiers over two simulated data sets. One data set is generated under a linear discriminant analysis (LDA) setting, which imitates the scenario where the model is correctly specified. The other is based on a quadratic discriminant analysis (QDA) setting, which is used to show the differences across the three methods under model mis-specification. We also compare the three methods using a bank marketing data set [Moro et al., 2014], which allows us to investigate the effect of model mis-specification on the performance of ROC curve estimation methods in a real problem setting. Throughout all the experiments, we consider linear classification methods including constrained ψ-learning, weighted SVM and its cutoff version, and also typical kernel methods that include kernel weighted SVM and its cutoff version. |
| Researcher Affiliation | Academia | Renxiong Liu Department of Statistics Ohio State University Columbus, OH 43210 liu.6732@buckeyemail.osu.edu Yunzhang Zhu Department of Statistics Ohio State University Columbus, OH 43210 zhu.219@osu.edu |
| Pseudocode | No | No pseudocode or algorithm blocks found in the paper. |
| Open Source Code | No | [No] We will release our codes later. |
| Open Datasets | Yes | Example 3 (Real data example). This example considers a bank marketing dataset [Moro et al., 2014], which records the direct marketing campaigns of a Portuguese banking institution and is available at https://archive.ics.uci.edu/ml/datasets/Bank+Marketing. |
| Dataset Splits | No | No explicit train/validation/test dataset splits by percentage or count are provided. The paper mentions |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, etc.) for running experiments are provided. |
| Software Dependencies | No | No specific ancillary software details with version numbers are provided. |
| Experiment Setup | Yes | For all three examples, to generate the estimated ROC curve we vary weight w and the constraint upper bound α from {i/500 : i = 0, 1, ..., 500} for the weighted method and the constrained method, respectively. For the two cutoff methods, we choose w = 1/2. |
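
The grid sweep described in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code (which has not been released): `roc_point`, `sweep`, and `toy_fit_predict` are hypothetical names, and the toy rule merely stands in for fitting a weighted SVM or constrained ψ-learning classifier at each grid value. Only the grid {i/500 : i = 0, 1, ..., 500} comes from the paper.

```python
from typing import Callable, List, Sequence, Tuple

# Grid from the experiment setup: {i/500 : i = 0, 1, ..., 500}.
# The paper uses it for the weight w and for the constraint bound alpha.
GRID: List[float] = [i / 500 for i in range(501)]


def roc_point(pred: Sequence[int], y: Sequence[int]) -> Tuple[float, float]:
    """Return (false positive rate, true positive rate) for labels in {0, 1}.

    Assumes y contains at least one positive and one negative label.
    """
    pos = sum(1 for t in y if t == 1)
    neg = len(y) - pos
    tp = sum(1 for p, t in zip(pred, y) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, y) if p == 1 and t == 0)
    return fp / neg, tp / pos


def sweep(
    fit_predict: Callable[[float], Sequence[int]],
    y: Sequence[int],
    grid: Sequence[float] = GRID,
) -> List[Tuple[float, float]]:
    """Evaluate one classifier per grid value and collect the distinct ROC points."""
    return sorted({roc_point(fit_predict(v), y) for v in grid})


def toy_fit_predict(scores: Sequence[float]) -> Callable[[float], List[int]]:
    """Hypothetical stand-in for a fitted model: predicts positive when
    score >= 1 - w, so a larger w yields more positive predictions.
    This is NOT the weighted SVM or psi-learning fit from the paper."""
    def fit_predict(w: float) -> List[int]:
        return [1 if s >= 1 - w else 0 for s in scores]
    return fit_predict


# Toy usage: four observations with scores in (0, 1).
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 1, 0, 0]
curve = sweep(toy_fit_predict(scores), labels)
```

In a real experiment, `fit_predict` would refit the classifier for each grid value and predict on held-out data; for the two cutoff methods, the model would instead be fitted once with w = 1/2 and only the decision threshold varied.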