Large-scale Optimization of Partial AUC in a Range of False Positive Rates

Authors: Yao Yao, Qihang Lin, Tianbao Yang

NeurIPS 2022

Each entry below lists a reproducibility variable, the assessed result, and the LLM response quoting supporting evidence from the paper.

Research Type: Experimental
LLM Response: "Finally, we numerically demonstrated the effectiveness of our proposed algorithms in training both linear models and deep neural networks for partial AUC maximization and sum of ranked range loss minimization." and "In this section, we demonstrate the effectiveness of our algorithm AGD-SBCD for pAUC maximization and SoRR loss minimization problems (see Appendix E.1 for details). All experiments are conducted in Python and Matlab on a computer with the CPU 2GHz Quad-Core Intel Core i5 and the GPU NVIDIA GeForce RTX 2080 Ti. All datasets we used are publicly available and contain no personally identifiable information and offensive contents."

Researcher Affiliation: Academia
LLM Response: "Yao Yao, Department of Mathematics, The University of Iowa, yao-yao-2@uiowa.edu; Qihang Lin, Tippie College of Business, The University of Iowa, qihang-lin@uiowa.edu; Tianbao Yang, Department of Computer Science & Engineering, Texas A&M University, tianbao-yang@tamu.edu"

Pseudocode: Yes
LLM Response: "Algorithm 1 Stochastic Block Coordinate Descent for (13): (v, λ) = SBCD(w, λ, T, µ, l)" and "Algorithm 2 Approximate Gradient Descent (AGD) for (10)". A structural sketch of how the two algorithms nest is given below.

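The paper couples an outer approximate gradient descent loop (Algorithm 2) with an inner stochastic block coordinate descent solver (Algorithm 1). The following is a minimal structural sketch of that nesting, not the authors' implementation: `inner_step` and `approx_grad` are hypothetical placeholders for the updates the paper defines for problems (13) and (10), and the T_k budget follows the schedule reported in the experiment setup.

```python
import numpy as np

def sbcd(w, lam, T, mu, l, inner_step):
    """Algorithm 1 (sketch): run T stochastic block coordinate descent
    steps on the inner subproblem (13) and return the approximate
    solution (v, lam). `inner_step` is a hypothetical placeholder for
    the paper's block-coordinate update."""
    v = np.zeros_like(w)
    for _ in range(T):
        v, lam = inner_step(v, lam, w, mu, l)
    return v, lam

def agd_sbcd(w0, lam0, num_outer, c, mu, l, inner_step, approx_grad):
    """Algorithm 2 (sketch): approximate gradient descent on (10).
    Outer iteration k calls SBCD with the growing budget
    T_k = 50(k+1)^2 reported in the paper's experiment setup, then
    steps along an approximate gradient with learning rate c."""
    w, lam = w0, lam0
    for k in range(num_outer):
        T_k = 50 * (k + 1) ** 2
        v, lam = sbcd(w, lam, T_k, mu, l, inner_step)
        w = w - c * approx_grad(w, v, lam)  # approx. gradient of (10) at w
    return w
```
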
Open Source Code: Yes
LLM Response: "For all authors... (a) Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope? [Yes] See Algorithm 1 and 2 and Section 6. ... (3) If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See Section 6."

Open Datasets: Yes
LLM Response: "For maximizing pAUC, we focus on large-scale imbalanced medical dataset CheXpert [25], which is licensed under CC-BY-SA and has 224,316 images." and "All datasets we used are publicly available and contain no personally identifiable information and offensive contents."

Dataset Splits: Yes
LLM Response: "We learn the model DenseNet121 from scratch with the CheXpert training data split in train/val = 9:1 and the CheXpert validation dataset as the testing set, which has 234 samples." A minimal sketch of this split is given below.

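The quoted split is straightforward to reproduce. This sketch assumes the CheXpert training examples are available as an in-memory list; `chexpert_train_paths` is a hypothetical stand-in for the 224,316-image training set, and `random_state` is an arbitrary choice, not a value reported by the paper.

```python
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the 224,316 CheXpert training images.
chexpert_train_paths = [f"img_{i}.jpg" for i in range(224_316)]

# 9:1 train/val split of the CheXpert training data, as quoted above.
train_paths, val_paths = train_test_split(
    chexpert_train_paths, test_size=0.1, random_state=0
)

# The official CheXpert validation set (234 samples) is held out as the
# test set and used only for final evaluation.
```
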
Hardware Specification: Yes
LLM Response: "All experiments are conducted in Python and Matlab on a computer with the CPU 2GHz Quad-Core Intel Core i5 and the GPU NVIDIA GeForce RTX 2080 Ti."

Software Dependencies: No
LLM Response: "All experiments are conducted in Python and Matlab." No version numbers are provided for these environments or for any other libraries.

Experiment Setup: Yes
LLM Response: "For optimizing CE, we use the standard Adam optimizer. For optimizing AUC-M, we use the PESG optimizer in [71]. We run each method 10 epochs and the learning rate (c in AGD-SBCD) of all methods is tuned from {10⁻⁵, …, 10⁰}. The minibatch size is 32. For AGD-SBCD, T_k is set to 50(k+1)², µ is set to 10³/(N₊N₋) and γ is tuned from {0.1, 1, 2} × 10³/(N₊N₋)." These hyperparameters are collected in the sketch below.

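For reference, the quoted hyperparameters can be restated as a configuration dict. This is a hedged checklist for re-implementation, not an official config file: `N_POS` and `N_NEG` (the paper's N₊ and N₋, the numbers of positive and negative training samples) are hypothetical placeholder counts.

```python
# Placeholder positive/negative sample counts (the paper's N+ and N-).
N_POS, N_NEG = 1_000, 10_000

config = {
    "epochs": 10,
    "batch_size": 32,
    # Learning-rate grid {1e-5, ..., 1e0} for all methods (c in AGD-SBCD).
    "lr_grid": [10.0 ** p for p in range(-5, 1)],
    # Inner SBCD iteration budget at outer step k.
    "T_k": lambda k: 50 * (k + 1) ** 2,
    "mu": 1e3 / (N_POS * N_NEG),
    "gamma_grid": [s * 1e3 / (N_POS * N_NEG) for s in (0.1, 1, 2)],
    "ce_optimizer": "Adam",    # baseline cross-entropy training
    "aucm_optimizer": "PESG",  # AUC-M baseline, per reference [71]
}
```
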