MESA: Boost Ensemble Imbalanced Learning with MEta-SAmpler
Authors: Zhining Liu, Pengfei Wei, Jing Jiang, Wei Cao, Jiang Bian, Yi Chang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Extensive experiments on both synthetic and real-world tasks demonstrate the effectiveness, robustness, and transferability of MESA. Our code is available at https://github.com/ZhiningLiu1998/mesa." (Section 4, Experiments) "To thoroughly assess the effectiveness of MESA, two series of experiments are conducted: one on controlled synthetic toy datasets for visualization and the other on real-world imbalanced datasets to validate MESA's performance in practical applications." |
| Researcher Affiliation | Collaboration | Zhining Liu (Jilin University, znliu19@mails.jlu.edu.cn); Pengfei Wei (National University of Singapore, dcsweip@nus.edu.sg); Jing Jiang (University of Technology Sydney, jing.jiang@uts.edu.au); Wei Cao (Microsoft Research, weicao@microsoft.com); Jiang Bian (Microsoft Research, jiang.bian@microsoft.com); Yi Chang (Jilin University, yichang@jlu.edu.cn) |
| Pseudocode | Yes | Algorithm 1: Sample(Dτ; F, µ, σ); Algorithm 2: Ensemble training in MESA; Algorithm 3: Meta-training in MESA |
| Open Source Code | Yes | Our code is available at https://github.com/ZhiningLiu1998/mesa. |
| Open Datasets | Yes | We extend the experiments to real-world imbalanced classification tasks from the UCI repository [10] and KDD CUP 2004. For each dataset, we keep-out the 20% validation set and report the result of 4-fold stratified cross-validation (i.e., 60%/20%/20% training/validation/test split). |
| Dataset Splits | Yes | For each dataset, we keep-out the 20% validation set and report the result of 4-fold stratified cross-validation (i.e., 60%/20%/20% training/validation/test split). |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Setup Details. We build a series of imbalanced toy datasets corresponding to different levels of underlying class distribution overlapping, as shown in Fig. 3. All the datasets have the same imbalance ratio (|N|/|P| = 2, 000/200 = 10). In this experiment, MESA is compared with four representative EIL algorithms from 4 major EIL branches (Parallel/Iterative Ensemble + Under/Over-sampling), i.e., SMOTEBOOST [7], SMOTEBAGGING [42], RUSBOOST [35], and UNDERBAGGING [2]. All EIL methods are deployed with decision trees as base classifiers with ensemble size of 5. |
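The evaluation protocol quoted above (hold out 20% of the data, then run 4-fold stratified cross-validation on the remainder, giving a 60%/20%/20% training/validation/test split per fold) can be sketched as follows. This is a hedged illustration, not the authors' code: the toy data, random seeds, and use of scikit-learn's `train_test_split` / `StratifiedKFold` are assumptions for demonstration only.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

rng = np.random.RandomState(0)
# Toy imbalanced data mimicking the paper's setup (|N|/|P| = 2000/200 = 10).
X = rng.randn(2200, 2)
y = np.array([0] * 2000 + [1] * 200)

# Hold out 20% as the test set, stratified to preserve the class ratio.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# 4-fold stratified CV on the remaining 80%: each fold trains on 60% of the
# full data and validates on another 20%.
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
for train_idx, val_idx in skf.split(X_rest, y_rest):
    X_train, X_val = X_rest[train_idx], X_rest[val_idx]
    y_train, y_val = y_rest[train_idx], y_rest[val_idx]
    # ... fit an ensemble imbalanced learner (e.g., MESA) here ...
```

Each of the 4 folds yields 1320 training samples (60%), 440 validation samples (20%), and the fixed 440-sample test set (20%), with the 10:1 class ratio preserved by stratification.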