Balanced Self-Paced Learning for AUC Maximization

Authors: Bin Gu, Chenkang Zhang, Huan Xiong, Heng Huang

AAAI 2022, pp. 6765-6773

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Considering both the deep learning and kernel-based implementations, experimental results on several large-scale datasets demonstrate that our BSPAUC has a better generalization performance than existing state-of-the-art AUC maximization methods.
Researcher Affiliation | Academia | 1) MBZUAI, United Arab Emirates; 2) School of Computer & Software, Nanjing University of Information Science & Technology, P.R. China; 3) Institute for Advanced Study in Mathematics, Harbin Institute of Technology, P.R. China; 4) Department of Electrical & Computer Engineering, University of Pittsburgh, PA, USA
Pseudocode | Yes | Algorithm 1: Balanced self-paced learning for AUC maximization
Open Source Code | No | The paper mentions using "open codes" for other algorithms (KOIL_FIFO++, PPDSG, OPAUC) and for a poisoning attack method, but does not provide a link or an explicit statement about the availability of the source code for its own BSPAUC method.
Open Datasets | Yes | The benchmark datasets are obtained from the LIBSVM repository and cover different dimensions and imbalance ratios, as summarized in Table 1. Datasets are available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets.
Dataset Splits | No | The paper states, "We randomly partition each dataset into 75% for training and 25% for testing," but it does not specify a separate validation split.
Hardware Specification | Yes | All the experiments are conducted on a PC with 48 2.2GHz cores, 80GB RAM and 4 Nvidia 1080ti GPUs.
Software Dependencies | No | The paper mentions that implementations were done in Python but does not specify version numbers for Python itself or for any libraries/packages used.
Experiment Setup | Yes | For all kernel-based methods, we use the Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2σ^2)) and tune its hyper-parameter σ over 2^[-5,5] by 5-fold cross-validation. For the TSAM method and our Algorithm 3 (in the Appendix), the number of random Fourier features is selected from [500 : 500 : 4000] (i.e., 500 to 4000 in steps of 500). For the KOIL_FIFO++ method, the buffer size is set to 100 for each class. For all deep learning methods, we use the same network structure, which consists of eight fully connected layers with the ReLU activation function. For the PPDSG method, the initial stage is tuned from 200 to 2000. For our BSPAUC, the hyper-parameters are chosen according to the proportion of selected samples. Specifically, we start training with about 50% of the samples and then linearly increase λ to include more samples.
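Two ingredients of the reported setup can be sketched in Python: the Gaussian kernel with its σ grid of 2^[-5,5], and a linear schedule that grows the self-paced threshold λ so training starts with roughly 50% of the samples and gradually admits the rest. This is a minimal illustration, not the authors' code; the function names and the schedule endpoints (`lam_start`, `lam_end`) are assumptions.

```python
import numpy as np

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2.0 * sigma ** 2))

# Candidate bandwidths sigma in 2^[-5, 5]; the paper tunes over this
# range with 5-fold cross-validation.
sigma_grid = [2.0 ** p for p in range(-5, 6)]

def lambda_schedule(epoch, n_epochs, lam_start, lam_end):
    """Linearly increase the self-paced threshold lambda from lam_start
    (admitting about 50% of the samples) to lam_end (admitting all).
    The concrete endpoint values are illustrative assumptions."""
    t = epoch / max(n_epochs - 1, 1)
    return lam_start + t * (lam_end - lam_start)
```

In self-paced learning, a larger λ lowers the bar on per-sample loss for inclusion, so a linearly growing λ realizes the "easy samples first" curriculum the paper describes.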