Twice Class Bias Correction for Imbalanced Semi-supervised Learning

Authors: Lan Li, Bowen Tao, Lu Han, De-chuan Zhan, Han-jia Ye

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experimentation on CIFAR10/100-LT, STL10-LT, and the sizable long-tailed dataset SUN397, we provide conclusive evidence that our proposed TCBC method reliably enhances the performance of class-imbalanced semi-supervised learning.
Researcher Affiliation | Academia | Lan Li, Bowen Tao, Lu Han, De-chuan Zhan, Han-jia Ye*; National Key Laboratory for Novel Software Technology, Nanjing University; School of Artificial Intelligence, Nanjing University; Nanjing 210023, China; {lil, taobw, hanlu, yehj}@lamda.nju.edu.cn, zhandc@nju.edu.cn
Pseudocode | No | The paper describes its proposed method using mathematical formulations and an illustrative diagram (Figure 2), but it does not include a dedicated pseudocode block or algorithm listing.
Open Source Code | Yes | Code and appendix are publicly available at https://github.com/Lain810/TCBC.
Open Datasets | Yes | We conduct experiments on three benchmarks: CIFAR10, CIFAR100 (Krizhevsky, Hinton et al. 2009), and STL10 (Coates, Ng, and Lee 2011), which are commonly used in imbalanced learning and SSL tasks.
Dataset Splits | Yes | For the imbalance type, we adopt long-tailed (LT) imbalance, exponentially decreasing the number of samples from the largest class to the smallest. Following Lee, Shin, and Kim (2021), we denote the number of head-class samples in the labeled and unlabeled data as $N_1$ and $M_1$, respectively. The imbalance ratios of the labeled and unlabeled data, $\gamma_l$ and $\gamma_u$, can vary independently, and the per-class counts are $N_k = N_1 \cdot \gamma_l^{-\epsilon_k}$ and $M_k = M_1 \cdot \gamma_u^{-\epsilon_k}$, where $\epsilon_k = \frac{k-1}{K-1}$ for class $k$ of $K$ (see the split-construction sketch after the table). [...] We measure the top-1 accuracy on test data and report the median of the accuracy values over the last 20 epochs, following Berthelot et al. (2019).
Hardware Specification | No | The paper does not specify the hardware used to run the experiments; it mentions only the backbone architecture (WRN28-2).
Software Dependencies | No | The paper names the models and methods it builds on but does not list software libraries, frameworks, or version numbers.
Experiment Setup | Yes | We train a WideResNet-28-2 (WRN28-2) backbone on CIFAR10-LT, CIFAR100-LT, and STL10-LT. We evaluate the performance of TCBC using an EMA network, whose parameters are updated via an exponential moving average at every step, following Oh, Kim, and Kweon (2022). We measure the top-1 accuracy on test data and report the median of the accuracy values over the last 20 epochs, following Berthelot et al. (2019). Each set of experiments was conducted three times. Additional experimental details are provided in the appendix.
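The long-tailed split quoted in the Dataset Splits row reduces to a one-line count formula. Below is a minimal sketch of that construction; the function name lt_class_counts and the example values (N1 = 1500, gamma_l = 100) are illustrative assumptions, not taken from the paper.

```python
def lt_class_counts(n_head: int, gamma: float, num_classes: int) -> list[int]:
    """Per-class sample counts for an exponentially decaying long-tailed split.

    Implements N_k = N_1 * gamma**(-(k - 1) / (K - 1)): class 1 keeps n_head
    samples and class K keeps n_head / gamma samples.
    """
    return [
        round(n_head * gamma ** (-(k - 1) / (num_classes - 1)))
        for k in range(1, num_classes + 1)
    ]

# Illustrative CIFAR10-LT labeled split (N1 and gamma_l chosen for the
# example, not quoted from the paper):
print(lt_class_counts(1500, 100.0, 10))  # [1500, 899, ..., 15]
```

The same helper applied with $M_1$ and $\gamma_u$ yields the unlabeled counts $M_k$, which is what lets the two imbalance ratios vary independently.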
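The Experiment Setup row evaluates an EMA network whose weights trail the trained model. A minimal PyTorch sketch of such an update follows; the decay of 0.999 and the helper name ema_update are assumptions, since the page does not give these details.

```python
import copy

import torch

@torch.no_grad()
def ema_update(ema_model: torch.nn.Module, model: torch.nn.Module,
               decay: float = 0.999) -> None:
    """In-place EMA step: p_ema <- decay * p_ema + (1 - decay) * p."""
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1 - decay)
    # Buffers (e.g., BatchNorm running statistics) are typically copied as-is.
    for b_ema, b in zip(ema_model.buffers(), model.buffers()):
        b_ema.copy_(b)

# Typical usage: clone the model once, then call ema_update after every
# optimizer step; evaluation (top-1 accuracy) is run on ema_model.
# ema_model = copy.deepcopy(model).requires_grad_(False)
```

Under the reporting protocol quoted above, one would then log ema_model's test accuracy each epoch and report the median of the final 20 values.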