Balanced Self-Paced Learning for AUC Maximization
Authors: Bin Gu, Chenkang Zhang, Huan Xiong, Heng Huang
AAAI 2022, pp. 6765-6773
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Considering both the deep learning and kernel-based implementations, experimental results on several large-scale datasets demonstrate that our BSPAUC has a better generalization performance than existing state-of-the-art AUC maximization methods. |
| Researcher Affiliation | Academia | (1) MBZUAI, United Arab Emirates; (2) School of Computer & Software, Nanjing University of Information Science & Technology, P.R. China; (3) Institute for Advanced Study in Mathematics, Harbin Institute of Technology, P.R. China; (4) Department of Electrical & Computer Engineering, University of Pittsburgh, PA, USA |
| Pseudocode | Yes | Algorithm 1: Balanced self-paced learning for AUC maximization |
| Open Source Code | No | The paper mentions using the open-source implementations ("open codes") of the competing algorithms (KOIL_FIFO++, PPDSG, OPAUC) and of a poisoning attack method, but it does not provide a link or an explicit statement about the availability of source code for the proposed BSPAUC method. |
| Open Datasets | Yes | The benchmark datasets are obtained from the LIBSVM repository and cover different dimensions and imbalance ratios, as summarized in Table 1. Datasets are available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets. |
| Dataset Splits | No | The paper states, "We randomly partition each dataset into 75% for training and 25% for testing," but it does not specify a separate validation dataset split. |
| Hardware Specification | Yes | All experiments are conducted on a PC with 48 2.2 GHz cores, 80 GB RAM, and 4 NVIDIA GTX 1080 Ti GPUs. |
| Software Dependencies | No | The paper states that the implementations were done in Python but does not specify version numbers for Python itself or for any of the libraries/packages used. |
| Experiment Setup | Yes | For all kernel-based methods, we use the Gaussian kernel k(x, x') = exp(−‖x − x'‖² / (2σ²)) and tune its hyper-parameter σ ∈ 2^[−5,5] by 5-fold cross-validation. For the TSAM method and our Algorithm 3 (in the Appendix), the number of random Fourier features is selected from [500 : 500 : 4000]. For the KOIL_FIFO++ method, the buffer size is set to 100 for each class. For all deep learning methods, we use the same network structure, consisting of eight fully connected layers with the ReLU activation function. For the PPDSG method, the initial stage is tuned from 200 to 2000. For our BSPAUC, the hyper-parameters are chosen according to the proportion of selected samples: we start training with about 50% of the samples and then linearly increase λ to include more samples. (Illustrative sketches of this setup appear after the table.) |
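To make the kernel-based portion of the reported setup concrete, the sketch below shows the Gaussian kernel with the quoted σ ∈ 2^[−5,5] grid and a standard random Fourier feature (RFF) approximation with feature counts 500, 1000, ..., 4000. This is a minimal sketch of the generic constructions under those stated hyper-parameter ranges, not the authors' code; the function and variable names are our own.

```python
import numpy as np

# Hyper-parameter grids quoted in the Experiment Setup row (names here are ours).
SIGMA_GRID = [2.0 ** p for p in range(-5, 6)]   # sigma in 2^[-5, 5]
RFF_COUNTS = list(range(500, 4001, 500))        # [500 : 500 : 4000]

def gaussian_kernel(X, Z, sigma):
    """Pairwise k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Z ** 2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def random_fourier_features(X, n_features, sigma, seed=None):
    """Standard RFF map whose inner products approximate the Gaussian kernel."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 30))
    K_exact = gaussian_kernel(X, X, sigma=SIGMA_GRID[5])          # sigma = 1
    Phi = random_fourier_features(X, RFF_COUNTS[-1], sigma=1.0, seed=0)
    print("max abs error:", np.abs(K_exact - Phi @ Phi.T).max())
```

In practice, σ would be selected from SIGMA_GRID by 5-fold cross-validation and the RFF count from RFF_COUNTS, as the quoted setup describes.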
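The BSPAUC row also quotes a self-paced schedule: start with about 50% of the samples and linearly increase λ. The sketch below illustrates that schedule with the classical hard-threshold self-paced weighting; it is our own simplified illustration (the step size and function names are hypothetical), and it omits the paper's balance constraint on the selected proportions of positive and negative samples, which is only noted in a comment.

```python
import numpy as np

def init_lambda(losses, keep_fraction=0.5):
    """Choose lambda so roughly `keep_fraction` of samples fall below it
    (matching 'start training with about 50% samples')."""
    return float(np.quantile(losses, keep_fraction))

def self_paced_weights(losses, lam):
    """Generic hard-threshold self-paced rule: v_i = 1 if loss_i < lambda, else 0.
    BSPAUC additionally balances the selected proportions of positive and
    negative samples; that constraint is not reproduced in this sketch."""
    return (losses < lam).astype(float)

def lambda_schedule(lam0, step, n_rounds):
    """Linearly increase lambda so more samples are included over time."""
    return [lam0 + t * step for t in range(n_rounds)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    losses = rng.gamma(shape=2.0, scale=1.0, size=1000)  # stand-in per-sample losses
    lam0 = init_lambda(losses, keep_fraction=0.5)
    for lam in lambda_schedule(lam0, step=0.5, n_rounds=5):
        v = self_paced_weights(losses, lam)
        print(f"lambda={lam:.2f}  selected={v.mean():.2%}")
```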