Self-Damaging Contrastive Learning

Authors: Ziyu Jiang, Tianlong Chen, Bobak J. Mortazavi, Zhangyang Wang

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments across multiple datasets and imbalance settings show that SDCLR significantly improves not only overall accuracies but also balancedness, in terms of linear evaluation on the full-shot and few-shot settings.
Researcher Affiliation | Academia | Texas A&M University; University of Texas at Austin.
Pseudocode | No | The paper describes the workflow of the proposed SDCLR framework (Figure 1) and its components, but it does not include a dedicated pseudocode or algorithm block. (A hedged workflow sketch is given after this table.)
Open Source Code | Yes | Our code is available at https://github.com/VITA-Group/SDCLR.
Open Datasets | Yes | Our experiments are based on three popular imbalanced datasets at varying scales: long-tail CIFAR-10, long-tail CIFAR-100, and ImageNet-LT. Besides, to further stretch out contrastive learning's imbalance-handling ability, we also consider a more realistic and more challenging benchmark, long-tail ImageNet-100, as well as another long-tail ImageNet with a different exponential sampling rule. Table 7 (dataset download links): ImageNet - http://image-net.org/download; CIFAR-10 - https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz; CIFAR-100 - https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz. (A hedged long-tail subsampling sketch is given after this table.)
Dataset Splits | Yes | We also randomly select [10000, 20000, 2000] samples from the official training datasets of [CIFAR10/CIFAR100, ImageNet, ImageNet-100] as validation datasets, respectively.
Hardware Specification | Yes | Our code is based on PyTorch (Paszke et al., 2017), and all models are trained with GeForce RTX 2080 Ti and NVIDIA Quadro RTX 8000 GPUs.
Software Dependencies | No | The paper states, "Our codes are based on Pytorch (Paszke et al., 2017)," but does not provide a specific version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | The default pruning ratio is 90% for CIFAR and 30% for ImageNet. We employ SGD with momentum 0.9 as the optimizer for all fine-tuning. Following (Chen et al., 2020c), we use a learning rate of 30 and remove the weight decay for all fine-tuning. When fine-tuning for linear separability performance, we train for 30 epochs and decrease the learning rate by 10 times at epochs 10 and 20. However, when fine-tuning for few-shot performance, we train for 100 epochs and decrease the learning rate at epochs 40 and 60. On the full dataset of CIFAR10/CIFAR100, we pre-train for 1000 epochs. In contrast, on sub-sampled CIFAR10/CIFAR100, we enlarge the number of pre-training epochs to 2000, given the small dataset size. Moreover, the number of pre-training epochs for ImageNet-LT-exp/ImageNet-100-LT is set to 500. (A hedged sketch of this fine-tuning configuration is given after this table.)
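
As noted in the Pseudocode row, the paper describes the SDCLR workflow only via Figure 1. Below is a minimal sketch of that two-branch idea as we read it, not the authors' implementation: the helper names (make_pruned_copy, nt_xent), the temperature, the 128-d projection output, and the deep-copied pruned branch are our own simplifications.

```python
# Hedged sketch of the SDCLR two-branch idea (not the authors' code).
# A dense encoder and a magnitude-pruned ("self-damaged") copy each embed one
# augmented view, and an NT-Xent (SimCLR-style) loss ties the two branches
# together. In SDCLR itself the pruned branch shares weights with the dense
# branch and its mask is refreshed every epoch; the deep copy below is a
# simplification.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.utils.prune as prune
import torchvision


def make_pruned_copy(model: nn.Module, ratio: float) -> nn.Module:
    """Return a copy of `model` with the smallest-magnitude weights zeroed."""
    damaged = copy.deepcopy(model)
    layers = [(m, "weight") for m in damaged.modules()
              if isinstance(m, (nn.Conv2d, nn.Linear))]
    prune.global_unstructured(layers, pruning_method=prune.L1Unstructured,
                              amount=ratio)
    return damaged


def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.2):
    """SimCLR NT-Xent loss between two batches of embeddings."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2]), dim=1)
    sim = z @ z.t() / temperature
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))          # drop self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)


# One illustrative training step; 0.9 is the default CIFAR pruning ratio
# quoted in the Experiment Setup row (0.3 for ImageNet).
encoder = torchvision.models.resnet18(num_classes=128)   # 128-d output is assumed
view1, view2 = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)

damaged_encoder = make_pruned_copy(encoder, ratio=0.9)
loss = nt_xent(encoder(view1), damaged_encoder(view2))
loss.backward()
```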
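
For the Open Datasets and Dataset Splits rows, here is a minimal sketch of how an exponentially long-tailed CIFAR subset and a random validation hold-out are commonly constructed. The function long_tail_indices, the imbalance ratio of 100, and the seeds are assumptions, not values taken from the paper.

```python
# Hedged sketch: build an exponentially long-tailed CIFAR-10 subset and hold
# out a random validation split, in the spirit of the quoted setup. The rule
# n_c = n_max * ratio**(-c / (C - 1)) is the common long-tail CIFAR recipe;
# the authors' exact sampling rule and class ordering may differ.
import numpy as np
import torchvision


def long_tail_indices(targets, num_classes=10, imbalance_ratio=100, seed=0):
    """Indices of an exponentially decaying per-class subsample."""
    rng = np.random.default_rng(seed)
    targets = np.asarray(targets)
    n_max = int((targets == 0).sum())        # head-class size (5000 for CIFAR-10)
    keep = []
    for c in range(num_classes):
        n_c = int(n_max * imbalance_ratio ** (-c / (num_classes - 1)))
        cls_idx = np.flatnonzero(targets == c)
        keep.extend(rng.choice(cls_idx, size=n_c, replace=False))
    return np.array(keep)


train_set = torchvision.datasets.CIFAR10(root="data", train=True, download=True)
lt_idx = long_tail_indices(train_set.targets)

# Per the Dataset Splits row: 10,000 images randomly selected from the
# official CIFAR training set serve as the validation set.
val_idx = np.random.default_rng(0).choice(len(train_set), size=10_000, replace=False)
```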
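
For the Experiment Setup row, here is a minimal sketch of the quoted fine-tuning optimizer and schedule, assuming a frozen pre-trained encoder with a linear classifier on top; the classifier dimensions and the training-loop placeholder are illustrative.

```python
# Hedged sketch of the quoted fine-tuning recipe (not the authors' script):
# SGD with momentum 0.9, learning rate 30, no weight decay, and a 10x step
# decay at epochs 10/20 (linear evaluation, 30 epochs) or 40/60 (few-shot,
# 100 epochs). The linear classifier shape (512 -> 10) is illustrative.
import torch
import torch.nn as nn

linear_eval = True                          # False -> few-shot schedule
epochs = 30 if linear_eval else 100
milestones = [10, 20] if linear_eval else [40, 60]

classifier = nn.Linear(512, 10)             # e.g. ResNet-18 features -> 10 classes
optimizer = torch.optim.SGD(classifier.parameters(), lr=30, momentum=0.9,
                            weight_decay=0)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=milestones,
                                                 gamma=0.1)

for epoch in range(epochs):
    # ... forward/backward over frozen-encoder features would go here ...
    optimizer.step()                        # placeholder so the sketch runs end to end
    scheduler.step()
```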