Distribution Alignment Optimization through Neural Collapse for Long-tailed Classification

Authors: Jintong Gao, He Zhao, Dandan Guo, Hongyuan Zha

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The extensive experiments show the effectiveness of DisA, providing a promising solution to the imbalanced issue. To evaluate the effectiveness of our method, we conduct experiments on benchmark datasets for long-tailed classification, including CIFAR-LT-10 (Cui et al., 2019), CIFAR-LT-100 (Cui et al., 2019), and ImageNet-LT (Deng et al., 2009)."
Researcher Affiliation | Academia | "(1) School of Artificial Intelligence, Jilin University; (2) CSIRO's Data61; (3) The Chinese University of Hong Kong, Shenzhen."
Pseudocode | Yes | "We summarize the complete procedure of our DisA method in Algorithm 2." "Algorithm 1: Distribution Alignment Optimization"
Open Source Code | No | "Our code is available at DisA."
Open Datasets | Yes | "To evaluate the effectiveness of our method, we conduct experiments on benchmark datasets for long-tailed classification, including CIFAR-LT-10 (Cui et al., 2019), CIFAR-LT-100 (Cui et al., 2019), and ImageNet-LT (Deng et al., 2009)." (A dataset-construction sketch follows the table.)
Dataset Splits | No | "Let D = {(x_i, y_i)}_{i=1}^N be the training set for a multi-class imbalanced classification problem with K classes... We train 200 epochs with the batch size of 128..."
Hardware Specification | Yes | "In CIFAR-LT-10 and CIFAR-LT-100, we use ResNet-32 (He et al., 2016) as the backbone and use 200 epochs on a single Tesla A10 GPU and set the initial learning rate as 0.1, which is divided by 10 at the 160th and 180th epochs. We train 200 epochs with the batch size of 128 and weight decay of 5e-4 on four Tesla A10 GPUs."
Software Dependencies | No | "For all experiments, our method is implemented in PyTorch and using an SGD optimizer with a momentum of 0.9."
Experiment Setup | Yes | "For all experiments, our method is implemented in PyTorch and using an SGD optimizer with a momentum of 0.9. In CIFAR-LT-10 and CIFAR-LT-100, we use ResNet-32 (He et al., 2016) as the backbone and use 200 epochs on a single Tesla A10 GPU and set the initial learning rate as 0.1, which is divided by 10 at the 160th and 180th epochs. We train 200 epochs with the batch size of 128 and weight decay of 5e-4 on four Tesla A10 GPUs. The learning rate is initialized as 0.1 and decays to zero by a cosine annealing schedule during training. We set λ for the regularization weight in (10) as 0.1 and ε for the entropic regularization in (9) as 1." (A training-configuration sketch follows the table.)
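The CIFAR-LT variants cited in the Open Datasets row are usually built by subsampling the balanced CIFAR training sets with an exponentially decaying per-class size, following Cui et al. (2019). The sketch below illustrates that construction only; the imbalance factor of 100 is an assumption for illustration, since the quoted text does not state which factors the paper uses.

```python
import numpy as np
from torchvision.datasets import CIFAR10

def long_tailed_indices(targets, num_classes=10, imb_factor=100):
    """Indices for an exponential long-tailed subsample (Cui et al., 2019 style)."""
    targets = np.asarray(targets)
    n_max = len(targets) // num_classes  # 5000 per class for balanced CIFAR-10
    keep = []
    for c in range(num_classes):
        # class c keeps n_max * (1/imb_factor)^(c / (num_classes - 1)) samples
        n_c = int(n_max * (1.0 / imb_factor) ** (c / (num_classes - 1)))
        cls_idx = np.where(targets == c)[0]
        keep.extend(cls_idx[:n_c].tolist())
    return keep

train_set = CIFAR10(root="./data", train=True, download=True)
lt_indices = long_tailed_indices(train_set.targets, num_classes=10, imb_factor=100)
# torch.utils.data.Subset(train_set, lt_indices) would then serve as a CIFAR-LT-10 split.
```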
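The quoted Experiment Setup for CIFAR-LT (SGD with momentum 0.9, weight decay 5e-4, 200 epochs, batch size 128, learning rate 0.1 divided by 10 at epochs 160 and 180) maps roughly onto the PyTorch sketch below. The ResNet-18 backbone and the random tensors are stand-ins only: the paper trains a ResNet-32 on CIFAR-LT and optimizes its DisA objective, neither of which is reproduced here.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18

# Stand-in backbone and data; the paper uses ResNet-32 on CIFAR-LT, not ResNet-18 on random tensors.
model = resnet18(num_classes=100)
dummy = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 100, (256,)))
train_loader = DataLoader(dummy, batch_size=128, shuffle=True)

# SGD with momentum 0.9, weight decay 5e-4, initial learning rate 0.1 (as quoted).
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
# Learning rate divided by 10 at the 160th and 180th of 200 epochs (as quoted).
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[160, 180], gamma=0.1)

for epoch in range(200):
    for images, labels in train_loader:
        logits = model(images)
        loss = F.cross_entropy(logits, labels)  # placeholder; the paper optimizes its DisA loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```

The quoted setup also mentions a four-GPU run whose learning rate decays to zero by cosine annealing, presumably the ImageNet-LT configuration; swapping MultiStepLR for torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200) would approximate that variant.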