Contrastive Learning with Boosted Memorization

Authors: Zhihan Zhou, Jiangchao Yao, Yan-Feng Wang, Bo Han, Ya Zhang

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on a range of benchmark datasets demonstrate the effectiveness of BCL over several state-of-the-art methods.
Researcher Affiliation | Collaboration | 1. Cooperative Medianet Innovation Center, Shanghai Jiao Tong University; 2. Shanghai AI Laboratory; 3. Department of Computer Science, Hong Kong Baptist University.
Pseudocode | Yes | Algorithm 1: Boosted Contrastive Learning (BCL)
Open Source Code | Yes | Our code is available at https://github.com/MediaBrain-SJTU/BCL.
Open Datasets | Yes | We conduct extensive experiments on three benchmark long-tailed datasets: CIFAR-100-LT (Cao et al., 2019), ImageNet-LT (Liu et al., 2019) and Places-LT (Liu et al., 2019).
Dataset Splits | Yes | In the default case, we conduct 100-shot evaluation on CIFAR-100-LT, ImageNet-LT and Places-LT for performance evaluation. Meanwhile, we also implement the full-shot, 100-shot and 50-shot evaluation for the ablation study on CIFAR-100-LT.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments. It only implies computation through discussion of training procedures, e.g., "train the classifier".
Software Dependencies | No | The paper mentions using SGD and Adam optimizers, ResNet architectures, and the RandAugment technique, but does not specify software dependencies such as programming-language versions (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow) with specific version numbers, or other libraries.
Experiment Setup | Yes | For all experiments, we use the SGD optimizer and the cosine annealing schedule. Similar to the backbone architecture and projection head proposed in (Chen et al., 2020a), we use ResNet-18 (He et al., 2016) as the backbone for experiments on CIFAR-100-LT and ResNet-50 on ImageNet-LT and Places-LT. The smoothing factor β in the momentum loss Eq. (3) is set as 0.97. Besides, we set k = 1 for BCL-I and k = 2 for BCL-D in RandAugment. The whole augmentation set A is aligned with RandAugment, where K = 16. For the other pre-training settings, we follow (Jiang et al., 2021b), and during evaluation we leverage the protocol of (Ermolov et al., 2021). Specifically, we train the classifier for 500 epochs and employ a learning rate decaying from 10^-2 to 10^-6. We use the Adam optimizer with weight decay 5×10^-6.
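
The Experiment Setup row above quotes concrete training hyperparameters. As a rough illustration of how those settings could map onto code, the sketch below assumes a PyTorch implementation; the pre-training learning rate and epoch count, the placeholder names (`backbone`, `classifier`, `num_train_samples`), and the exact form of the momentum-loss update are assumptions, not quotes from the paper. The authors' actual implementation is in the linked repository.

```python
# Minimal sketch (not the authors' released code) of the quoted training setup,
# assuming PyTorch. Values marked "assumed" are not stated in the extracted text.
import torch
from torch import nn, optim
from torchvision.models import resnet18

# Self-supervised pre-training: SGD + cosine annealing, ResNet-18 on CIFAR-100-LT.
backbone = resnet18(num_classes=128)  # final fc stands in for the projection head
sgd = optim.SGD(backbone.parameters(), lr=0.5, momentum=0.9)          # lr/momentum assumed
pretrain_sched = optim.lr_scheduler.CosineAnnealingLR(sgd, T_max=1000)  # epoch count assumed

# Momentum loss (Eq. 3): per-sample moving average of the contrastive loss, beta = 0.97.
beta = 0.97
num_train_samples = 10_847                     # e.g. CIFAR-100-LT, imbalance 100 (placeholder)
momentum_loss = torch.zeros(num_train_samples)

def update_momentum_loss(indices: torch.Tensor, per_sample_loss: torch.Tensor) -> None:
    """Exponential moving average of each sample's loss, used as a memorization proxy."""
    momentum_loss[indices] = beta * momentum_loss[indices] + (1.0 - beta) * per_sample_loss.detach()

# Linear evaluation: Adam, 500 epochs, lr decaying from 1e-2 to 1e-6, weight decay 5e-6.
classifier = nn.Linear(512, 100)               # ResNet-18 feature dim -> 100 CIFAR classes
adam = optim.Adam(classifier.parameters(), lr=1e-2, weight_decay=5e-6)
eval_sched = optim.lr_scheduler.CosineAnnealingLR(adam, T_max=500, eta_min=1e-6)  # decay shape assumed
```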