Quadruply Stochastic Gradients for Large Scale Nonlinear Semi-Supervised AUC Optimization

Authors: Wanli Shi, Bin Gu, Xiang Li, Xiang Geng, Heng Huang

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | 'Extensive experimental results on a variety of benchmark datasets show that QSG-S2AUC is far more efficient than the existing state-of-the-art algorithms for semi-supervised AUC maximization, while retaining similar generalization performance.' ... 'In this section, we present the experimental results on several datasets to demonstrate the effectiveness and efficiency of QSG-S2AUC.' |
| Researcher Affiliation | Collaboration | Wanli Shi (1), Bin Gu (1,2), Xiang Li (3), Xiang Geng (1), Heng Huang (2,4). (1) School of Computer & Software, Nanjing University of Information Science & Technology, P.R. China; (2) JD Finance America Corporation; (3) Computer Science Department, University of Western Ontario, Canada; (4) Department of Electrical & Computer Engineering, University of Pittsburgh, USA |
| Pseudocode | Yes | Algorithm 1: {α_i}_{i=1}^t = QSG-S2AUC(D_p, D_n, p(x)). An illustrative sketch of this interface appears after the table. |
| Open Source Code | No | The paper mentions using 'the MATLAB code from https://github.com/t-sakai-kure/PNU as the implementation of PNU-AUC', which is a baseline method. It does not provide concrete access to the source code for the proposed QSG-S2AUC algorithm. |
| Open Datasets | Yes | 'We carry out the experiments on eight large scale benchmark datasets collected from LIBSVM and UCI repositories.' LIBSVM is available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/; UCI is available at http://archive.ics.uci.edu/ml/datasets.html. A loading sketch appears after the table. |
| Dataset Splits | Yes | 'The hyper-parameters (λ, σ and γ) are chosen via 5-fold cross-validation.' |
| Hardware Specification | Yes | 'All the experiments were run on a PC with 56 2.2GHz cores and 80GB RAM.' |
| Software Dependencies | No | The paper states 'We implemented QSG-S2AUC and SAMULT algorithms in MATLAB' but does not provide specific version numbers for MATLAB or any other software dependencies. |
| Experiment Setup | Yes | 'For all algorithms, we use the square pairwise loss l(u, v) = (1 - u + v)^2 and Gaussian kernel k(x, x') = exp(-σ‖x - x'‖^2). The hyper-parameters (λ, σ and γ) are chosen via 5-fold cross-validation. λ and σ were searched in the region {(λ, σ) | 2^-3 ≤ λ ≤ 2^3, 2^-3 ≤ σ ≤ 2^3}. The trade-off parameter γ in SAMULT and QSG-S2AUC was searched from 0 to 1 at intervals of 0.1, and that in PNU-AUC was searched from -1 to 1 at intervals of 0.1. In addition, the class prior π in PNU-AUC is set to the class proportion in the whole training set, which can be estimated by [du Plessis et al., 2015]. All the results are the average of 10 trials.' The grids are spelled out in the last sketch below. |
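
The paper's Algorithm 1 exposes the interface {α_i}_{i=1}^t = QSG-S2AUC(D_p, D_n, p(x)): t stochastic steps that each emit one random-feature coefficient α_i. Below is a minimal Python sketch of that interface, assuming the standard doubly-stochastic random-feature recipe applied to the square pairwise AUC loss quoted in the setup row. It is not the authors' algorithm: it omits the unlabeled data and the γ trade-off of the full semi-supervised objective (the paper's fourth source of randomness), and every name in it is ours.

```python
import numpy as np

def qsg_s2auc_sketch(X_pos, X_neg, T=500, lam=0.1, sigma=1.0, step0=1.0, seed=0):
    """Illustrative quadruply-stochastic pairwise AUC loop (not the authors' code).

    Each iteration samples a positive point, a negative point, and a fresh
    random Fourier feature for the Gaussian kernel
    k(x, x') = exp(-sigma * ||x - x'||^2), then takes a stochastic
    functional-gradient step on the square pairwise loss
    (1 - f(x_pos) + f(x_neg))^2, emitting one coefficient alpha_t.
    """
    rng = np.random.default_rng(seed)
    d = X_pos.shape[1]
    omegas, phases, alphas = [], [], []

    def phi(x, w, b):
        # Random Fourier feature: E[phi(x)phi(x')] = exp(-sigma*||x-x'||^2)
        # when w ~ N(0, 2*sigma*I) and b ~ U[0, 2*pi].
        return np.sqrt(2.0) * np.cos(w @ x + b)

    def f(x):
        # Current model f(x) = sum_i alpha_i * phi_i(x); O(t) per query,
        # which is acceptable for a sketch.
        return sum(a * phi(x, w, b) for a, w, b in zip(alphas, omegas, phases))

    for t in range(1, T + 1):
        xp = X_pos[rng.integers(len(X_pos))]          # stochastic source 1
        xn = X_neg[rng.integers(len(X_neg))]          # stochastic source 2
        w = rng.normal(0.0, np.sqrt(2.0 * sigma), d)  # stochastic source 3
        b = rng.uniform(0.0, 2.0 * np.pi)
        eta = step0 / np.sqrt(t)                      # decaying step size
        g = 2.0 * (1.0 - f(xp) + f(xn))               # = -d loss / d f(xp)
        # Shrink old coefficients (ridge term lam) and append the new one,
        # which pushes f up at xp and down at xn.
        alphas = [(1.0 - eta * lam) * a for a in alphas]
        alphas.append(eta * g * (phi(xp, w, b) - phi(xn, w, b)))
        omegas.append(w)
        phases.append(b)
    return alphas, omegas, phases
```

Storing every (w, b) explicitly is the naive choice; the doubly-stochastic literature this sketch is modeled on typically regenerates features from stored random seeds instead, which is likely how a memory-efficient implementation would do it.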
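
For the datasets row: the LIBSVM page linked above serves its binary classification sets in svmlight format, which the following hedged snippet loads and splits by class. The file name a9a is just one example from that page; the quoted text does not name the paper's eight datasets.

```python
from sklearn.datasets import load_svmlight_file

# Load one LIBSVM-format binary dataset; "a9a" is an illustrative choice
# from the LIBSVM binary page, not necessarily one of the paper's eight.
X, y = load_svmlight_file("a9a")

# Split rows by label (+1 / -1) for the pairwise AUC loss; these arrays
# feed directly into qsg_s2auc_sketch above.
X_pos = X[y == 1].toarray()
X_neg = X[y == -1].toarray()
```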
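
The quoted experiment setup translates directly into concrete search grids and an evaluation metric. A sketch under those stated values follows (variable names are ours); the empirical AUC at the end is the standard pairwise definition that the square loss acts as a surrogate for.

```python
import numpy as np

# Hyper-parameter grids exactly as quoted above.
lambdas = 2.0 ** np.arange(-3, 4)                   # 2^-3, ..., 2^3
sigmas  = 2.0 ** np.arange(-3, 4)                   # 2^-3, ..., 2^3
gammas  = np.round(np.arange(0.0, 1.01, 0.1), 1)    # QSG-S2AUC / SAMULT trade-off
etas    = np.round(np.arange(-1.0, 1.01, 0.1), 1)   # PNU-AUC trade-off

# Square pairwise loss from the quote: l(u, v) = (1 - u + v)^2,
# where u = f(x_pos) and v = f(x_neg).
def pairwise_square_loss(u, v):
    return (1.0 - u + v) ** 2

# Empirical AUC: fraction of positive/negative pairs ranked correctly.
def empirical_auc(scores_pos, scores_neg):
    return np.mean(scores_pos[:, None] > scores_neg[None, :])
```

Per the quoted setup, 5-fold cross-validation over the (λ, σ) grid would then pick the pair with the best held-out AUC, with all reported numbers averaged over 10 trials.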