Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition

Authors: Zi-Hao Zhou, Siyuan Fang, Zi-Jing Zhou, Tong Wei, Yuanyu Wan, Min-Ling Zhang

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments across multiple datasets with varying unlabeled data distributions demonstrate that CCL consistently outperforms prior state-of-the-art methods, achieving over 4% improvement on the ImageNet-127 dataset.
Researcher Affiliation | Collaboration | (1) School of Computer Science and Engineering, Southeast University, Nanjing, China; (2) Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, China; (3) Xiaomi Inc., China; (4) School of Software Technology, Zhejiang University, Ningbo, China
Pseudocode | Yes | Algorithm 1: Continuous Contrastive Learning (CCL)
Open Source Code | Yes | Our source code is available at https://github.com/zhouzihao11/CCL.
Open Datasets | Yes | Our experimental analysis uses a variety of commonly adopted SSL datasets, including the CIFAR10-LT [35], CIFAR100-LT [35], STL10-LT [13], and ImageNet-127 [22] datasets.
Dataset Splits | Yes | We adopt imbalance ratios of γl = γu = 100 and γl = γu = 150 for consistent settings, while for uniform and reversed settings, we use γl = 100, γu = 1 and γl = 100, γu = 1/100, respectively. (A sketch of this split construction follows the table.)
Hardware Specification | Yes | In addition, our method is implemented using the PyTorch library and experimented on an NVIDIA RTX A6000 (48 GB VRAM) with an Intel Platinum 8260 (CPU, 2.30 GHz, 220 GB RAM).
Software Dependencies | No | The paper mentions using the PyTorch library but does not specify a version number for it or any other software dependency.
Experiment Setup | Yes | Specifically, we apply the WideResNet-28-2 [70] architecture to implement our method on the CIFAR10-LT, CIFAR100-LT, and STL10-LT datasets, and ResNet-50 on ImageNet-127. We adopt the common training paradigm in which the network is trained with standard SGD [47, 49, 58] for 500 epochs, where each epoch consists of 500 mini-batches, with a batch size of 64 for both labeled and unlabeled data. We use a cosine learning rate decay [42] with an initial rate of 0.03. We set τ = 2.0 for logit adjustment on all datasets, except for ImageNet-127, where τ = 0.1. We set the temperature T = 1 and the threshold ζ = 8.75 for the energy score following [69], and we set λ1 = 0.7, λ2 = 1.0 on CIFAR10/100-LT and λ1 = 0.7, λ2 = 1.5 on STL10-LT and ImageNet-127 for the final loss. We set β = 0.2 in Eq. (21) for the smoothed pseudo-label loss. (A configuration sketch follows the table.)
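
The "Dataset Splits" row reports imbalance ratios γl and γu rather than explicit per-class counts. Below is a minimal sketch of the exponential long-tailed profile commonly used to realize such ratios in CIFAR-LT-style benchmarks; the head-class sizes (n_max values) are illustrative assumptions, not figures taken from the paper.

```python
def long_tailed_counts(n_max: int, num_classes: int, gamma: float) -> list[int]:
    """Per-class sample counts under the standard exponential long-tailed profile:
    class k keeps n_max * gamma^(-k / (num_classes - 1)) samples, so the head class
    has n_max samples and the tail class has n_max / gamma."""
    return [int(n_max * gamma ** (-k / (num_classes - 1))) for k in range(num_classes)]

# Illustrative CIFAR10-LT-style splits (head-class sizes are assumed, not from the paper).
labeled_consistent = long_tailed_counts(n_max=1500, num_classes=10, gamma=100)    # γl = 100
unlabeled_uniform  = long_tailed_counts(n_max=3000, num_classes=10, gamma=1)      # γu = 1
unlabeled_reversed = long_tailed_counts(n_max=30,   num_classes=10, gamma=1/100)  # γu = 1/100
```

Setting gamma = 1 yields a uniform unlabeled set, while gamma = 1/100 inverts the profile so the labeled head classes have the fewest unlabeled samples, matching the reversed setting described above.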
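
The "Experiment Setup" row can likewise be read as a training configuration. The sketch below wires the reported hyperparameters into a PyTorch SGD plus cosine-decay setup; the momentum and weight-decay values, the placeholder backbone, and the sign conventions used for logit adjustment and the energy score are assumptions rather than details reported above.

```python
import torch

# Hyperparameters reported in the paper (CIFAR10/100-LT values).
EPOCHS, ITERS_PER_EPOCH, BATCH_SIZE = 500, 500, 64
INIT_LR, TAU, TEMPERATURE, ENERGY_THRESHOLD = 0.03, 2.0, 1.0, 8.75
LAMBDA_1, LAMBDA_2, BETA = 0.7, 1.0, 0.2

# Placeholder backbone; the paper uses WideResNet-28-2 (CIFAR/STL) and ResNet-50 (ImageNet-127).
model = torch.nn.Linear(3 * 32 * 32, 10)

# Standard SGD with cosine learning-rate decay over all training iterations.
# Momentum and weight decay are assumed typical values, not reported in the text.
optimizer = torch.optim.SGD(model.parameters(), lr=INIT_LR, momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS * ITERS_PER_EPOCH)

def adjusted_logits(logits: torch.Tensor, class_prior: torch.Tensor, tau: float = TAU) -> torch.Tensor:
    """Training-time logit adjustment: add tau * log(prior) to the logits before
    cross-entropy (sign convention assumed from the standard logit-adjustment loss)."""
    return logits + tau * torch.log(class_prior + 1e-12)

def energy_score(logits: torch.Tensor, temperature: float = TEMPERATURE) -> torch.Tensor:
    """Temperature-scaled log-sum-exp of the logits, compared against the threshold
    zeta = 8.75 to decide which unlabeled samples are treated as reliable; the gating
    direction is an assumption, not taken from the paper."""
    return temperature * torch.logsumexp(logits / temperature, dim=-1)
```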