Quadruply Stochastic Gradients for Large Scale Nonlinear Semi-Supervised AUC Optimization
Authors: Wanli Shi, Bin Gu, Xiang Li, Xiang Geng, Heng Huang
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on a variety of benchmark datasets show that QSG-S2AUC is far more efficient than the existing state-of-the-art algorithms for semi-supervised AUC maximization, while retaining similar generalization performance. In this section, we present the experimental results on several datasets to demonstrate the effectiveness and efficiency of QSG-S2AUC. |
| Researcher Affiliation | Collaboration | Wanli Shi (1), Bin Gu (1,2), Xiang Li (3), Xiang Geng (1) and Heng Huang (2,4); (1) School of Computer & Software, Nanjing University of Information Science & Technology, P.R. China; (2) JD Finance America Corporation; (3) Computer Science Department, University of Western Ontario, Canada; (4) Department of Electrical & Computer Engineering, University of Pittsburgh, USA |
| Pseudocode | Yes | Algorithm 1: {α_i}_{i=1}^t = QSG-S2AUC(D_p, D_n, p(x)) (a hedged sketch of how such coefficients define the learned decision function follows the table) |
| Open Source Code | No | The paper mentions using 'the MATLAB code from https://github.com/t-sakai-kure/PNU as the implementation of PNU-AUC', which is for a baseline method. It does not provide concrete access to the source code for the proposed QSG-S2AUC algorithm. |
| Open Datasets | Yes | We carry out the experiments on eight large scale benchmark datasets collected from the LIBSVM and UCI repositories. LIBSVM is available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/. UCI is available at http://archive.ics.uci.edu/ml/datasets.html. |
| Dataset Splits | Yes | The hyper-parameters (λ, σ and γ) are chosen via 5-fold cross-validation. |
| Hardware Specification | Yes | All the experiments were run on a PC with 56 2.2GHz cores and 80GB RAM. |
| Software Dependencies | No | The paper states 'We implemented QSG-S2AUC and SAMULT algorithms in MATLAB' but does not provide specific version numbers for MATLAB or any other software dependencies. |
| Experiment Setup | Yes | For all algorithms, we use the square pairwise loss l(u, v) = (1 - u + v)^2 and Gaussian kernel k(x, x') = exp(-σ‖x - x'‖^2). The hyper-parameters (λ, σ and γ) are chosen via 5-fold cross-validation. λ and σ were searched in the region {(λ, σ) \| 2^-3 ≤ λ ≤ 2^3, 2^-3 ≤ σ ≤ 2^3}. The trade-off parameter γ in SAMULT and QSG-S2AUC was searched from 0 to 1 at intervals of 0.1, and that in PNU-AUC was searched from -1 to 1 at intervals of 0.1. In addition, the class prior π in PNU-AUC is set to the class proportion in the whole training set, which can be estimated by [du Plessis et al., 2015]. All the results are the average of 10 trials. (These settings are spelled out in the grid-search sketch at the end of this page.) |
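The pseudocode row above quotes only the interface of Algorithm 1: given positive data D_p, negative data D_n and the marginal density p(x) of the unlabeled data, QSG-S2AUC returns coefficients {α_i}_{i=1}^t. Below is a minimal sketch of how such coefficients typically define the learned decision function in doubly/quadruply stochastic random-feature methods; the random Fourier feature construction, the per-iteration seeding convention and all function names are assumptions for illustration, not the authors' MATLAB implementation.

```python
import numpy as np

def rff_feature(x, seed, sigma):
    """One random Fourier feature phi_omega(x) for the Gaussian kernel
    k(x, x') = exp(-sigma * ||x - x'||^2); its spectral distribution is
    N(0, 2*sigma*I). Re-seeding per iteration reproduces the feature drawn
    at that iteration (an illustrative convention, not the paper's)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, np.sqrt(2.0 * sigma), size=x.shape[0])
    b = rng.uniform(0.0, 2.0 * np.pi)
    return np.sqrt(2.0) * np.cos(w @ x + b)

def predict(x, alphas, sigma):
    """Evaluate f(x) = sum_{i=1}^t alpha_i * phi_{omega_i}(x), where omega_i
    is the random feature re-drawn from seed i."""
    return sum(a * rff_feature(x, i, sigma) for i, a in enumerate(alphas))

# Usage with hypothetical coefficients returned by the algorithm:
# x_test = np.array([0.2, -0.5, 1.0])
# score = predict(x_test, alphas=[0.1, -0.05, 0.02], sigma=1.0)
```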
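To make the reported experiment setup concrete, the sketch below writes out the square pairwise loss, the Gaussian kernel and the hyper-parameter grids quoted in the table. The paper only states the search ranges and 5-fold cross-validation; the power-of-two step for λ and σ, the variable names and the grid construction are assumptions.

```python
import numpy as np
from itertools import product

def pairwise_loss(u, v):
    """Square pairwise loss l(u, v) = (1 - u + v)^2, applied to the scores of
    a positive example (u) and a negative example (v)."""
    return (1.0 - u + v) ** 2

def gaussian_kernel(x, x_prime, sigma):
    """Gaussian kernel k(x, x') = exp(-sigma * ||x - x'||^2)."""
    return np.exp(-sigma * np.sum((x - x_prime) ** 2))

# Search grids from the quoted setup: 2^-3 <= lambda, sigma <= 2^3
# (power-of-two steps assumed) and gamma in {0, 0.1, ..., 1.0}.
lambdas = [2.0 ** k for k in range(-3, 4)]
sigmas = [2.0 ** k for k in range(-3, 4)]
gammas = [round(0.1 * k, 1) for k in range(11)]

# Candidate configurations to be scored by 5-fold cross-validation;
# the selection criterion (e.g. validation AUC) is left abstract here.
grid = list(product(lambdas, sigmas, gammas))
```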