T-distributed Spherical Feature Representation for Imbalanced Classification
Authors: Xiaoyu Yang, Yufei Chen, Xiaodong Yue, Shaoxun Xu, Chao Ma
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on large-scale imbalanced datasets verify our method, which shows superior results on the long-tailed CIFAR-100/-10 with the imbalanced ratio IR = 100/50. Our method also achieves excellent results on the large-scale ImageNet-LT dataset and the iNaturalist dataset with various backbones. In addition, we provide a case study of the real clinical classification of pancreatic tumor subtypes with 6 categories. |
| Researcher Affiliation | Collaboration | Xiaoyu Yang1, Yufei Chen1*, Xiaodong Yue2,3,4, Shaoxun Xu1, Chao Ma5 1 College of Electronics and Information Engineering, Tongji University, Shanghai, China 2 School of Computer Engineering and Science, Shanghai University, Shanghai, China 3 Artificial Intelligence Institute of Shanghai University, Shanghai, China 4 VLN Lab, NAVI MedTech Co., Ltd., Shanghai, China 5 Department of Radiology, Changhai Hospital of Shanghai, Shanghai, China yufeichen@tongji.edu.cn |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | Our proposed method was evaluated on major public imbalanced datasets and various backbones. The improvements across different tasks show the generalization and feasibility of our method. 1. Long-tailed CIFAR-10/-100: we constructed the long-tailed CIFAR-10/-100 dataset following (Zhou et al. 2020; Yang and Xu 2020)... 2. ImageNet-LT (Liu et al. 2019): ImageNet-LT is a long-tailed subset of the ImageNet-2012 dataset (Russakovsky et al. 2015)... 3. iNaturalist (Cui et al. 2018): iNaturalist is a real-world large-scale dataset for species recognition of animals and plants. |
| Dataset Splits | Yes | 1. Long-tailed CIFAR-10/-100: we constructed the long-tailed CIFAR-10/-100 dataset following (Zhou et al. 2020; Yang and Xu 2020), which consists of approximately 10K-13K training samples and 10K test images. The imbalanced ratio IR = N_max / N_min controls the distribution of the training sets, where N_max and N_min are the numbers of samples in the largest and smallest categories. In addition, the validation dataset is set according to the same ratio as the training dataset, meaning the distribution of the training dataset represents the distribution of the real world. 2. ImageNet-LT (Liu et al. 2019): ...the category with the largest number has 1280 images and the smallest category has only 5 images in the training dataset. Unlike the test and validation sets of long-tailed CIFAR-10/-100, each category in the test and validation sets of ImageNet-LT contains the same number of images: 50 and 20, respectively. (A sketch of this long-tailed construction appears after the table.) |
| Hardware Specification | No | The paper does not explicitly state the specific hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'stochastic gradient descent (SGD)' as the optimizer but does not provide specific version numbers for software dependencies or libraries used. |
| Experiment Setup | Yes | ResNet-32 is used as our backbone, the same as the compared methods. The network is trained by stochastic gradient descent (SGD) with a momentum of 0.9 for 90 epochs, following the experiment setting of Deconfound-TDE (Tang, Huang, and Zhang 2021). ...We use ResNet-10, ResNet-50 and ResNeXt-50 as the backbones, and the network is trained by the SGD optimizer for 300 epochs with a batch size of 2048. We use Auto-Augmentation and a class-aware sampler for data augmentation and sampling. (A training-configuration sketch appears after the table.) |
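
The Dataset Splits row describes building long-tailed CIFAR with imbalanced ratio IR = N_max / N_min. Below is a minimal sketch of one such construction, assuming the exponential per-class decay commonly used for long-tailed CIFAR benchmarks; the function names and the use of torchvision are our illustration, not code from the paper.

```python
import numpy as np
import torchvision

def long_tailed_counts(n_max, num_classes, imbalance_ratio):
    """Per-class counts decaying exponentially from n_max down to n_max / imbalance_ratio."""
    return [int(n_max * (1.0 / imbalance_ratio) ** (c / (num_classes - 1)))
            for c in range(num_classes)]

def make_long_tailed_cifar10(root, imbalance_ratio=100):
    """Subsample the CIFAR-10 training set so that class c keeps counts[c] images."""
    train = torchvision.datasets.CIFAR10(root, train=True, download=True)
    targets = np.array(train.targets)
    counts = long_tailed_counts(n_max=5000, num_classes=10,
                                imbalance_ratio=imbalance_ratio)
    keep = np.hstack([np.where(targets == c)[0][:n]
                      for c, n in enumerate(counts)])
    train.data, train.targets = train.data[keep], targets[keep].tolist()
    return train

# Example: IR = 100 keeps 5000 images of the head class and 50 of the tail class,
# roughly 12K training images in total, consistent with the 10K-13K figure above.
lt_cifar10 = make_long_tailed_cifar10("./data", imbalance_ratio=100)
```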
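The Experiment Setup row reports SGD with momentum 0.9 and a class-aware sampler. The sketch below shows one plausible PyTorch configuration under those settings; the learning rate, weight decay, and the inverse-frequency weighting are assumptions on our part, since the paper does not specify them.

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def build_optimizer(model):
    # SGD with momentum 0.9 as reported; lr and weight_decay are assumed
    # placeholder values, not taken from the paper.
    return torch.optim.SGD(model.parameters(), lr=0.1,
                           momentum=0.9, weight_decay=5e-4)

def class_aware_loader(dataset, targets, batch_size=2048):
    # One common reading of a "class-aware sampler": draw each sample with
    # probability inversely proportional to its class frequency, so head and
    # tail classes are seen roughly equally often during training.
    targets = torch.as_tensor(targets)
    class_counts = torch.bincount(targets).float()
    sample_weights = (1.0 / class_counts)[targets]
    sampler = WeightedRandomSampler(sample_weights,
                                    num_samples=len(targets),
                                    replacement=True)
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```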