Twice Class Bias Correction for Imbalanced Semi-supervised Learning
Authors: Lan Li, Bowen Tao, Lu Han, De-chuan Zhan, Han-jia Ye
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experimentation on CIFAR10/100-LT, STL10-LT, and the sizable long-tailed dataset SUN397, we provide conclusive evidence that our proposed TCBC method reliably enhances the performance of class-imbalanced semi-supervised learning. |
| Researcher Affiliation | Academia | Lan Li, Bowen Tao, Lu Han, De-chuan Zhan, Han-jia Ye*; National Key Laboratory for Novel Software Technology, Nanjing University; School of Artificial Intelligence, Nanjing University; Nanjing, 210023, China. {lil, taobw, hanlu, yehj}@lamda.nju.edu.cn, zhandc@nju.edu.cn |
| Pseudocode | No | The paper describes its proposed method using mathematical formulations and an illustrative diagram (Figure 2), but it does not include a dedicated pseudocode block or algorithm listing. |
| Open Source Code | Yes | Code and appendix are publicly available at https://github.com/Lain810/TCBC. |
| Open Datasets | Yes | We conduct experiments on three benchmarks including CIFAR10, CIFAR100 (Krizhevsky, Hinton et al. 2009) and STL10 (Coates, Ng, and Lee 2011), which are commonly used in imbalance learning and SSL tasks. |
| Dataset Splits | Yes | For imbalance types, we adopt long-tailed (LT) imbalance by exponentially decreasing the number of samples from the largest to the smallest class. Following (Lee, Shin, and Kim 2021), we denote the number of head-class samples in the labeled and unlabeled data as $N_1$ and $M_1$ respectively. The imbalance ratios for the labeled and unlabeled data are defined as $\gamma_l$ and $\gamma_u$, which can vary independently. We have $N_k = N_1 \cdot \gamma_l^{-\epsilon_k}$ and $M_k = M_1 \cdot \gamma_u^{-\epsilon_k}$, where $\epsilon_k = \frac{k-1}{K-1}$ (see the split-construction sketch after this table). [...] We measure the top-1 accuracy on test data and finally report the median of accuracy values of the last 20 epochs following Berthelot et al. (2019). |
| Hardware Specification | No | The paper does not specify the hardware used for running experiments; it mentions only the backbone model (WRN-28-2). |
| Software Dependencies | No | The paper does not specify software frameworks, libraries, or version information. |
| Experiment Setup | Yes | We train Wide ResNet-28-2 (WRN28-2) on CIFAR10-LT, CIFAR100-LT and STL10-LT as a backbone. We evaluate the performance of TCBC using an EMA network, whose parameters are updated via an exponential moving average at every step, following Oh, Kim, and Kweon (2022) (an EMA update sketch follows the table). We measure the top-1 accuracy on test data and finally report the median of accuracy values of the last 20 epochs following Berthelot et al. (2019). Each set of experiments was conducted three times. Additional experimental details are provided in the appendix. |
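The exponential decay rule in the Dataset Splits row fully determines the per-class sample counts. Below is a minimal sketch of that construction, assuming the illustrative values $N_1 = 1500$, $M_1 = 3000$, and $\gamma_l = \gamma_u = 100$, which are common in CIFAR10-LT setups but not confirmed by the excerpt above:

```python
def long_tailed_counts(n_head: int, gamma: float, num_classes: int) -> list[int]:
    """Per-class sample counts for an exponentially decaying long-tailed split.

    Class k (1-indexed) gets n_head * gamma ** (-(k - 1) / (K - 1)) samples,
    so class 1 keeps n_head samples and class K keeps n_head / gamma.
    """
    return [
        int(n_head * gamma ** (-(k - 1) / (num_classes - 1)))
        for k in range(1, num_classes + 1)
    ]

# Illustrative CIFAR10-LT-style splits; gamma_l and gamma_u may differ.
labeled_counts = long_tailed_counts(n_head=1500, gamma=100, num_classes=10)
unlabeled_counts = long_tailed_counts(n_head=3000, gamma=100, num_classes=10)
print(labeled_counts)  # [1500, 899, 539, 323, 193, 116, 69, 41, 25, 15]
```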
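The EMA evaluation network from the Experiment Setup row can be sketched as follows. This is not the authors' code: the decay value 0.999 is a conventional choice assumed here, and the tiny `Linear` stand-in replaces the WRN-28-2 backbone for self-containedness:

```python
import copy
import torch

@torch.no_grad()
def ema_update(ema_model: torch.nn.Module, model: torch.nn.Module,
               decay: float = 0.999) -> None:
    """One EMA step: ema_param <- decay * ema_param + (1 - decay) * param.

    Called after every optimizer step; the EMA model is used only for
    evaluation and is never trained directly.
    """
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)
    # Buffers (e.g., BatchNorm running stats) are typically copied directly.
    for ema_b, b in zip(ema_model.buffers(), model.buffers()):
        ema_b.copy_(b)

# Typical wiring: clone the training model once, then update every step.
model = torch.nn.Linear(8, 2)  # hypothetical stand-in for the WRN-28-2 backbone
ema_model = copy.deepcopy(model).eval()
for p in ema_model.parameters():
    p.requires_grad_(False)
# ... after each optimizer.step():
ema_update(ema_model, model, decay=0.999)
```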