Stochastic Optimization of Areas Under Precision-Recall Curves with Provable Convergence

Authors: Qi Qi, Youzhi Luo, Zhao Xu, Shuiwang Ji, Tianbao Yang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on image and graph datasets demonstrate that our proposed method outperforms prior methods on imbalanced problems in terms of AUPRC. To the best of our knowledge, our work represents the first attempt to optimize AUPRC with provable convergence. We conduct comprehensive empirical studies on class imbalanced graph and image datasets for learning graph neural networks and deep convolutional neural networks, respectively. We demonstrate that the proposed method can consistently outperform prior approaches in terms of AUPRC.
Researcher Affiliation | Academia | Department of Computer Science, The University of Iowa; Department of Computer Science & Engineering, Texas A&M University; {qi-qi,tianbao-yang}@uiowa.edu, {yzluo,zhaoxu,sji}@tamu.edu
Pseudocode | Yes | Algorithm 1: SOAP; Algorithm 2: UG(B, B_+, u, w_t, γ, u_0); Algorithm 3: UW(w_t, G(w_t)). (An illustrative sketch of the SOAP update appears after this table.)
Open Source Code | Yes | The SOAP has been implemented in the LibAUC library at https://libauc.org/. The code for reproducing the results is released here [44].
Open Datasets | Yes | We first conduct experiments on three image datasets: CIFAR10, CIFAR100 and Melanoma dataset [49]... We use the datasets HIV and MUV from the MoleculeNet [55].
Dataset Splits | Yes | And we split the training dataset into train/validation set at 80%/20% ratio. ...we manually split the training data into train/validation/test set at 80%/10%/10% ratio... We use the split of train/validation/test set provided by MoleculeNet... We conduct experiments on three random train/validation/test splits at 80%/10%/10% ratio. (An illustrative split sketch follows the table.)
Hardware Specification | No | The paper mentions using deep learning models (ResNet, GNNs) but does not specify any particular hardware (e.g., GPU model, CPU, memory) used for running the experiments.
Software Dependencies | No | The paper mentions software like the "LibAUC library" and model architectures like "ResNet18 and ResNet34" or "MPNN, GINE and ML-MPNN" but does not provide specific version numbers for any of these software components or libraries.
Experiment Setup | Yes | We tune the learning rate in a range {1e-5, 1e-4, 1e-3, 1e-2} and the weight decay parameter in a range {1e-6, 1e-5, 1e-4}... we tune γ of SOAP in a range {0.9, 0.99, 0.999}, and tune m in {0.5, 1, 2, 5, 10}... We pre-train the networks by Adam with 100 epochs and a tuned initial learning rate 0.0005, which is decayed by half after 50 epochs... We pre-train GNNs by the Adam method for 100 epochs with a batch size of 64 and a tuned learning rate of 0.0005, which is decayed by half at the 50th epoch. (A sketch of this pre-training schedule follows the table.)
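
To make the pseudocode row concrete, below is a minimal, illustrative PyTorch sketch of the kind of moving-average AP estimator that Algorithm 1 (SOAP) describes: per-positive moving averages of the pairwise surrogate sums, with detached coefficients used to form the stochastic gradient. The squared-hinge surrogate, the class structure, and all names here are assumptions for illustration; this is not the authors' released LibAUC implementation.

# Illustrative sketch (not the authors' code) of a SOAP-style AP surrogate.
import torch

class SOAPSketch:
    def __init__(self, num_pos, margin=1.0, gamma=0.9):
        self.margin = margin   # m: margin of the assumed squared-hinge surrogate
        self.gamma = gamma     # γ: moving-average parameter (tuned in the paper)
        # u[:, 0] ≈ avg over positives j of ℓ(w; x_j, x_i); u[:, 1] ≈ avg over all j
        self.u = torch.zeros(num_pos, 2)

    def loss(self, scores, labels, pos_index):
        """scores: model outputs for a mini-batch containing at least one positive;
        labels: 0/1 targets; pos_index: moving-average slot of each batch positive."""
        pos_mask = labels == 1
        pos_scores = scores[pos_mask]                      # shape (P,)
        # pairwise squared-hinge surrogate for every batch example j (columns)
        # against every batch positive anchor i (rows)
        ell = torch.clamp(self.margin - (pos_scores[:, None] - scores[None, :]), min=0) ** 2
        g1 = ell[:, pos_mask].mean(dim=1)                  # estimate over positives
        g2 = ell.mean(dim=1)                               # estimate over all examples
        # update the per-positive moving averages (no gradient flows through them);
        # the exact placement of γ follows Algorithm 2 in the paper, this is just an EMA
        with torch.no_grad():
            self.u[pos_index, 0] = (1 - self.gamma) * self.u[pos_index, 0] + self.gamma * g1
            self.u[pos_index, 1] = (1 - self.gamma) * self.u[pos_index, 1] + self.gamma * g2
        u1 = self.u[pos_index, 0]
        u2 = self.u[pos_index, 1].clamp(min=1e-8)
        # gradient of -g1/g2 ≈ -(1/u2)·∇g1 + (u1/u2²)·∇g2, realized by weighting the
        # differentiable batch estimates with detached moving-average coefficients
        surrogate = (-1.0 / u2) * g1 + (u1 / u2 ** 2) * g2
        return surrogate.mean()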
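
The dataset-splits row quotes 80%/20% and 80%/10%/10% ratios but no splitting code. A minimal sketch of an 80%/10%/10% split with scikit-learn follows; the function name is hypothetical and stratifying on the label is an assumption (natural for imbalanced data, but not stated in the quote).

# Hypothetical illustration of an 80%/10%/10% train/validation/test split.
from sklearn.model_selection import train_test_split

def split_80_10_10(X, y, seed=0):
    # first carve out 20% of the data for validation + test
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    # then split that 20% evenly into validation and test (10% each overall)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)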
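
Finally, a sketch of the quoted pre-training schedule: Adam for 100 epochs, initial learning rate 0.0005 halved at epoch 50, batch size 64. The linear model, the synthetic imbalanced data, the binary cross-entropy objective, and the specific weight-decay value (one point of the tuned grid {1e-6, 1e-5, 1e-4}) are assumptions made to keep the example self-contained; they are not the paper's exact setup.

# Sketch of the quoted pre-training schedule, with toy stand-ins for model and data.
import torch

model = torch.nn.Linear(32, 1)                       # stand-in for ResNet18/34 or a GNN
X = torch.randn(256, 32)
y = (torch.rand(256) < 0.05).float()                 # toy imbalanced binary labels
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X, y), batch_size=64, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, weight_decay=1e-5)
# halve the learning rate at epoch 50, as described in the quoted setup
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50], gamma=0.5)
criterion = torch.nn.BCEWithLogitsLoss()

for epoch in range(100):
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs).squeeze(1), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()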