Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition
Authors: Zi-Hao Zhou, Siyuan Fang, Zi-Jing Zhou, Tong Wei, Yuanyu Wan, Min-Ling Zhang
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across multiple datasets with varying unlabeled data distributions demonstrate that CCL consistently outperforms prior state-of-the-art methods, achieving over 4% improvement on the ImageNet-127 dataset. |
| Researcher Affiliation | Collaboration | (1) School of Computer Science and Engineering, Southeast University, Nanjing, China; (2) Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, China; (3) Xiaomi Inc., China; (4) School of Software Technology, Zhejiang University, Ningbo, China |
| Pseudocode | Yes | Algorithm 1: Continuous Contrastive Learning (CCL) |
| Open Source Code | Yes | Our source code is available at https://github.com/zhouzihao11/CCL. |
| Open Datasets | Yes | Our experimental analysis uses a variety of commonly adopted SSL datasets, including CIFAR10-LT [35], CIFAR100-LT [35], STL10-LT [13], and ImageNet-127 [22] datasets. |
| Dataset Splits | Yes | We adopt imbalance ratios of γ_l = γ_u = 100 and γ_l = γ_u = 150 for consistent settings, while for uniform and reversed settings, we use γ_l = 100, γ_u = 1 and γ_l = 100, γ_u = 1/100, respectively. |
| Hardware Specification | Yes | In addition, our method is implemented using the PyTorch library and experimented on an NVIDIA RTX A6000 (48 GB VRAM) with an Intel Platinum 8260 (CPU, 2.30 GHz, 220 GB RAM). |
| Software Dependencies | No | The paper mentions using the "PyTorch library" but does not specify a version number for it or any other software dependency. |
| Experiment Setup | Yes | Specifically, we apply the WideResNet-28-2 [70] architecture to implement our method on the CIFAR10-LT, CIFAR100-LT and STL10-LT datasets, and ResNet-50 on ImageNet-127. We adopt the common training paradigm in which the network is trained with standard SGD [47, 49, 58] for 500 epochs, where each epoch consists of 500 mini-batches, with a batch size of 64 for both labeled and unlabeled data. We use a cosine learning rate decay [42] with an initial rate of 0.03. We set τ = 2.0 for logit adjustment on all datasets, except for ImageNet-127, where τ = 0.1. We set the temperature T = 1 and the threshold ζ = 8.75 for the energy score following [69], and we set λ_1 = 0.7, λ_2 = 1.0 on CIFAR10/100-LT and λ_1 = 0.7, λ_2 = 1.5 on STL10-LT and ImageNet-127 for the final loss. We set β = 0.2 in Eq. (21) for the smoothed pseudo-label loss. |
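
The imbalance ratios quoted in the Dataset Splits row are usually realized by giving class k an exponentially decaying sample count. The sketch below illustrates that common LTSSL convention; the function name, the head-class counts (1500 labeled, 30 unlabeled), and the exact sampling rule are illustrative assumptions, not taken from the paper or its released code.

```python
# Hypothetical helper (not from the CCL repository): per-class sample counts for a
# long-tailed split with head/tail imbalance ratio gamma, following the common
# n_k = n_head * gamma^(-k / (K - 1)) convention used for CIFAR10/100-LT-style splits.
def long_tailed_counts(n_head: int, num_classes: int, gamma: float) -> list[int]:
    if num_classes == 1 or gamma == 1:
        return [n_head] * num_classes  # uniform split, e.g. the gamma_u = 1 setting
    return [int(n_head * gamma ** (-k / (num_classes - 1))) for k in range(num_classes)]

# Example: a CIFAR10-LT-style labeled split with gamma_l = 100; the head class keeps
# 1500 samples (an assumed value) and the tail class keeps 1500 / 100 = 15.
print(long_tailed_counts(n_head=1500, num_classes=10, gamma=100))

# A reversed unlabeled split (gamma_u = 1/100) makes the counts grow toward the
# classes that are tail classes in the labeled data.
print(long_tailed_counts(n_head=30, num_classes=10, gamma=1 / 100))
```

For the Experiment Setup row, the quoted schedule and hyper-parameters can be wired up in PyTorch roughly as follows. This is a minimal sketch, not the authors' training script: the backbone is a stand-in for WideResNet-28-2 / ResNet-50, the CCL losses are elided, and the momentum and weight-decay values are common defaults that the excerpt does not specify.

```python
import torch
from torch import nn, optim

EPOCHS, STEPS_PER_EPOCH, BATCH_SIZE = 500, 500, 64

# Stand-in backbone; the paper uses WideResNet-28-2 on CIFAR10/100-LT and STL10-LT,
# and ResNet-50 on ImageNet-127.
model = nn.Linear(3 * 32 * 32, 10)

# Standard SGD with cosine learning-rate decay starting at 0.03; momentum 0.9 and
# weight decay 5e-4 are assumed defaults, not quoted in the setup above.
optimizer = optim.SGD(model.parameters(), lr=0.03, momentum=0.9, weight_decay=5e-4)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS * STEPS_PER_EPOCH)

# Hyper-parameters quoted in the setup (CIFAR10/100-LT values); lambda_2 becomes 1.5
# on STL10-LT / ImageNet-127, and tau becomes 0.1 on ImageNet-127.
tau = 2.0                # logit-adjustment scale
temperature = 1.0        # energy-score temperature T
energy_threshold = 8.75  # energy-score threshold zeta
lambda_1, lambda_2 = 0.7, 1.0
beta = 0.2               # pseudo-label smoothing coefficient from Eq. (21)

for epoch in range(EPOCHS):
    for step in range(STEPS_PER_EPOCH):
        # ... compute the CCL losses on a labeled and an unlabeled batch of size 64,
        # then backpropagate; omitted here ...
        optimizer.step()
        scheduler.step()  # cosine decay is stepped once per mini-batch
        optimizer.zero_grad()
```

Stepping the scheduler once per mini-batch with T_max set to the total number of updates (500 × 500) is one standard way to realize "cosine learning rate decay" over the whole run; the paper's code may instead decay per epoch.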