Stochastic Optimization of Areas Under Precision-Recall Curves with Provable Convergence
Authors: Qi Qi, Youzhi Luo, Zhao Xu, Shuiwang Ji, Tianbao Yang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on image and graph datasets demonstrate that our proposed method outperforms prior methods on imbalanced problems in terms of AUPRC. To the best of our knowledge, our work represents the first attempt to optimize AUPRC with provable convergence. We conduct comprehensive empirical studies on class imbalanced graph and image datasets for learning graph neural networks and deep convolutional neural networks, respectively. We demonstrate that the proposed method can consistently outperform prior approaches in terms of AUPRC. |
| Researcher Affiliation | Academia | Department of Computer Science, The University of Iowa; Department of Computer Science & Engineering, Texas A&M University. {qi-qi,tianbao-yang}@uiowa.edu, {yzluo,zhaoxu,sji}@tamu.edu |
| Pseudocode | Yes | Algorithm 1: SOAP; Algorithm 2: UG(B, B+, u, w_t, γ, u_0); Algorithm 3: UW(w_t, G(w_t)). A hedged sketch of the SOAP update appears after this table. |
| Open Source Code | Yes | SOAP has been implemented in the LibAUC library at https://libauc.org/. The code for reproducing the results is released here [44]. |
| Open Datasets | Yes | We first conduct experiments on three image datasets: CIFAR10, CIFAR100 and Melanoma dataset [49]... We use the datasets HIV and MUV from MoleculeNet [55]. |
| Dataset Splits | Yes | We split the training dataset into train/validation sets at an 80%/20% ratio. ...we manually split the training data into train/validation/test sets at an 80%/10%/10% ratio... We use the train/validation/test split provided by MoleculeNet... We conduct experiments on three random train/validation/test splits at an 80%/10%/10% ratio. A sketch of such a split appears after this table. |
| Hardware Specification | No | The paper mentions using deep learning models (ResNet, GNNs) but does not specify any particular hardware (e.g., GPU model, CPU, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software like the "LibAUC library" and model architectures like "ResNet18 and ResNet34" or "MPNN, GINE and ML-MPNN" but does not provide specific version numbers for any of these software components or libraries. |
| Experiment Setup | Yes | We tune the learning rate in the range {1e-5, 1e-4, 1e-3, 1e-2} and the weight decay parameter in the range {1e-6, 1e-5, 1e-4}... we tune γ of SOAP in the range {0.9, 0.99, 0.999}, and tune m in {0.5, 1, 2, 5, 10}... We pre-train the networks by Adam for 100 epochs with a tuned initial learning rate of 0.0005, which is decayed by half after 50 epochs... We pre-train GNNs by Adam for 100 epochs with a batch size of 64 and a tuned learning rate of 0.0005, which is decayed by half at the 50th epoch. A sketch of this pre-training schedule appears after this table. |
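
To make the pseudocode row concrete, below is a minimal sketch of one SOAP-style stochastic step, assuming a squared-hinge pairwise surrogate and per-positive moving-average estimators updated with factor γ (loosely following the paper's Algorithms 1-3). All names here (`soap_step`, `squared_hinge`, `u`, `pos_ids`) are illustrative placeholders, not the authors' released LibAUC implementation.

```python
import torch

def squared_hinge(margin_diff, m):
    # pairwise surrogate loss: (max(0, m - (s_pos - s_other)))^2
    return torch.clamp(m - margin_diff, min=0) ** 2

def soap_step(model, optimizer, u, pos_x, batch_x, batch_y, pos_ids,
              m=1.0, gamma=0.99):
    """One SOAP-style stochastic step (illustrative sketch).

    u        : [num_positives, 2] buffer of moving-average estimators
    pos_x    : a batch of positive examples (the AP "anchors")
    batch_x  : a batch drawn from the whole training set, labels batch_y
    pos_ids  : row indices of pos_x into u
    """
    s_pos = model(pos_x).squeeze(-1)      # scores of positive anchors
    s_all = model(batch_x).squeeze(-1)    # scores of the mixed batch

    # pairwise surrogate losses: rows = anchors, cols = batch examples
    L = squared_hinge(s_pos[:, None] - s_all[None, :], m)

    pos_mask = (batch_y == 1).float()
    g_num = (L * pos_mask[None, :]).mean(dim=1)  # inner sum over positives only
    g_den = L.mean(dim=1)                        # inner sum over all examples

    # moving-average estimators of the two inner sums (Algorithm 2 style)
    with torch.no_grad():
        u[pos_ids, 0] = gamma * u[pos_ids, 0] + (1 - gamma) * g_num
        u[pos_ids, 1] = gamma * u[pos_ids, 1] + (1 - gamma) * g_den
    u_num, u_den = u[pos_ids, 0], u[pos_ids, 1]

    # loss whose gradient is the estimator-weighted ratio gradient of -AP:
    # -(u_den * grad g_num - u_num * grad g_den) / u_den^2
    loss = (-(u_den * g_num - u_num * g_den) / (u_den ** 2 + 1e-8)).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The loss is built so that the estimators `u_num`, `u_den` act as fixed weights while gradients flow only through the fresh batch estimates, which is the moving-average trick the paper uses to obtain provable convergence for the nonconvex compositional AP objective.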
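
For the dataset-splits row, a minimal sketch of an 80%/10%/10% random split using scikit-learn. `X` and `y` are placeholder arrays, and the `stratify` arguments are an assumption; the paper does not state whether its random splits preserve the class ratio.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.randn(1000, 16)                 # placeholder features
y = (np.random.rand(1000) < 0.1).astype(int)  # imbalanced placeholder labels (~10% positive)

# 80% train, then split the remaining 20% evenly into validation and test
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0, stratify=y_tmp)
```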
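
For the experiment-setup row, a minimal sketch of the described pre-training schedule in PyTorch: Adam at an initial learning rate of 5e-4 for 100 epochs, with the learning rate halved at epoch 50. The placeholder model, data, and binary cross-entropy loss are assumptions standing in for the paper's ResNets/GNNs and datasets.

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(16, 1)  # placeholder for a ResNet or GNN
train_loader = [(torch.randn(8, 16), torch.randint(0, 2, (8,)))]  # placeholder loader

optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
# "decayed by half after 50 epochs" -> halve the learning rate at the epoch-50 milestone
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50], gamma=0.5)

for epoch in range(100):
    for x, y in train_loader:
        logits = model(x).squeeze(-1)
        loss = F.binary_cross_entropy_with_logits(logits, y.float())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()  # advance the learning-rate schedule once per epoch
```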