Trainable Undersampling for Class-Imbalance Learning

Authors: Minlong Peng, Qi Zhang, Xiaoyu Xing, Tao Gui, Xuanjing Huang, Yu-Gang Jiang, Keyu Ding, Zhigang Chen (pp. 4707-4714)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on both synthetic and realistic datasets demonstrate the effectiveness of the proposed method.
Researcher Affiliation | Collaboration | Minlong Peng,1 Qi Zhang,1 Xiaoyu Xing,1 Tao Gui,1 Xuanjing Huang,1 Yu-Gang Jiang,1 Keyu Ding,2 Zhigang Chen2. 1School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai Institute of Intelligent Electronics & Systems, Shanghai, China; 1{mlpeng16, qz, xyxing14, tgui16, xjhuang, ygj}@fudan.edu.cn; 2{kyding, zgcheng}@iflytek.com
Pseudocode | Yes | Algorithm 1: Trainable Undersampling (see the first sketch after the table)
Open Source Code | No | The paper does not provide a direct statement or link to the open-source code for the methodology described in this paper.
Open Datasets | Yes | Credit Fraud contains transactions made by credit cards in September 2013 by European cardholders (Dal Pozzolo et al. 2015). It is highly imbalanced, with only 492 frauds out of 284,807 transactions. ... Diabetic Retinopathy (DR) is an imbalanced version of the Diabetic Retinopathy Detection dataset, where the negative examples belong to class 0 (No DR) and the positive examples belong to the rest. Following the work of (Leibig et al. 2017), we used the AUC-ROC to measure the performance. [Footnote: more information about this dataset is available in the Kaggle challenge, https://www.kaggle.com/c/diabetic-retinopathy-detection] (label construction sketched after the table)
Dataset Splits | Yes | Based on these assumptions, we first chose the supervised classifier and its corresponding hyperparameters for each tested task with 5-fold cross-validation on the original training dataset. (model-selection step sketched after the table)
Hardware Specification | No | Figure 2 depicts the training process of the proposed method over time (seconds) on three tested datasets using a single GPU. No specific GPU model or other detailed hardware specifications are provided.
Software Dependencies | No | The paper mentions using PyTorch, scikit-learn, and imbalanced-learn, but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | As for the hyperparameters of the sampling strategies themselves, such as the sampling probability of each class, we chose the values that yielded the best performance over 20 random runs. We implemented the GRU network with 25 hidden units and the MLP as a one-layer perceptron in PyTorch, and we used the RMSprop (Tieleman and Hinton 2012) step rule for parameter optimization with its initial learning rate set to 0.001. For the Credit Fraud task, we first trained the proposed data sampler on a smaller training dataset containing all (denoted by n) of the positive examples and 10n negative examples. Then, every 200 iterations, we added 10n more negative examples to the subset until all of the data were used. We implemented the GRU network with 50 hidden units for this task, and with 25 units for the other tasks. For the SMS Spam task, we reduced the input dimension to 100 with PCA for the policy network. (key pieces of this setup are sketched after the table)
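
The Pseudocode row refers to Algorithm 1 (Trainable Undersampling), which trains a data sampler jointly with the downstream classifier. The paper's exact algorithm is not reproduced in this report; below is a minimal sketch of what a REINFORCE-style trainable-undersampling loop of this shape could look like. The one-layer-perceptron policy, the logistic-regression base classifier, the validation AUC-ROC reward, and the moving-average baseline are all illustrative assumptions, not the paper's specification.

```python
# Illustrative sketch only: reward design, baseline, and classifier choice are
# assumptions, not the paper's exact Algorithm 1.
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

class SamplerPolicy(nn.Module):
    """One-layer perceptron mapping a feature vector to a keep-probability."""
    def __init__(self, n_features):
        super().__init__()
        self.mlp = nn.Linear(n_features, 1)

    def forward(self, x):
        return torch.sigmoid(self.mlp(x)).squeeze(-1)

def train_sampler(X_pos, X_neg, X_val, y_val, steps=50, lr=1e-3):
    """X_pos/X_neg: float tensors of minority/majority examples;
    X_val/y_val: numpy validation arrays used to compute the reward."""
    policy = SamplerPolicy(X_neg.shape[1])
    opt = torch.optim.RMSprop(policy.parameters(), lr=lr)
    baseline = 0.0
    for _ in range(steps):
        probs = policy(X_neg)              # keep-probability for each negative
        keep = torch.bernoulli(probs)      # sampled undersampling mask (0/1)
        mask = keep.bool()
        if not mask.any():                 # skip degenerate empty subsets
            continue
        # Train the downstream classifier on all positives + kept negatives.
        X_train = torch.cat([X_pos, X_neg[mask]]).numpy()
        y_train = [1] * len(X_pos) + [0] * int(mask.sum())
        clf = LogisticRegression(max_iter=200).fit(X_train, y_train)
        reward = roc_auc_score(y_val, clf.predict_proba(X_val)[:, 1])
        # REINFORCE: reinforce keep/drop decisions that beat the running baseline.
        log_p = keep * torch.log(probs + 1e-8) + (1 - keep) * torch.log(1 - probs + 1e-8)
        loss = -(reward - baseline) * log_p.sum()
        baseline = 0.9 * baseline + 0.1 * reward
        opt.zero_grad()
        loss.backward()
        opt.step()
    return policy
```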
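
The DR task in the Open Datasets row is a binarization of the Kaggle Diabetic Retinopathy Detection labels (class 0 = negative, all other grades = positive), evaluated with AUC-ROC. A small sketch of that label construction, assuming the competition's trainLabels.csv layout with image and level columns; verify the column names against the actual download.

```python
# Assumes the Kaggle "trainLabels.csv" layout (columns: image, level).
import pandas as pd
from sklearn.metrics import roc_auc_score

labels = pd.read_csv("trainLabels.csv")
y = (labels["level"] > 0).astype(int)   # level 0 (No DR) -> negative; 1-4 -> positive
print(f"positives: {y.sum()}, negatives: {(y == 0).sum()}")

# Evaluation metric reported for this task (scores = classifier probabilities):
# auc = roc_auc_score(y_true, scores)
```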
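
The Dataset Splits row describes selecting the supervised classifier and its hyperparameters via 5-fold cross-validation on the original training set. A minimal scikit-learn sketch of that selection step follows; the SVC candidate, the hyperparameter grid, and the synthetic stand-in data are illustrative assumptions, not the paper's choices.

```python
# Illustrative model-selection sketch: the SVC candidate and grid are assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Stand-in for an original, imbalanced training split.
X_train, y_train = make_classification(
    n_samples=1000, weights=[0.95, 0.05], random_state=0
)

search = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]},
    scoring="roc_auc",   # matches the AUC-ROC metric reported for DR
    cv=5,                # the 5-fold cross-validation described in the paper
)
search.fit(X_train, y_train)
best_classifier = search.best_estimator_
```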
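
Several concrete numbers in the Experiment Setup row translate directly into code: GRU hidden sizes (50 for Credit Fraud, 25 elsewhere), RMSprop with initial learning rate 0.001, the Credit Fraud schedule that starts from all n positives plus 10n negatives and adds 10n negatives every 200 iterations, and PCA to 100 dimensions for SMS Spam. A hedged sketch of those pieces; the feature dimensionality and how the GRU consumes examples are assumptions here.

```python
import torch.nn as nn
import torch.optim as optim
from sklearn.decomposition import PCA

n_features = 30   # assumption; e.g. the Credit Fraud data has 30 input features

# GRU state encoder: 50 hidden units for Credit Fraud, 25 for the other tasks.
gru = nn.GRU(input_size=n_features, hidden_size=50, batch_first=True)
head = nn.Linear(50, 1)   # the one-layer perceptron on top of the GRU state

# RMSprop step rule with the reported initial learning rate.
opt = optim.RMSprop(list(gru.parameters()) + list(head.parameters()), lr=0.001)

def negatives_available(iteration, n_pos, n_neg_total):
    """Credit Fraud schedule: 10n negatives at the start, +10n more every
    200 iterations, capped once all negatives are in use."""
    return min((iteration // 200 + 1) * 10 * n_pos, n_neg_total)

# SMS Spam: reduce the policy network's input to 100 dimensions with PCA.
pca = PCA(n_components=100)
```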