A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning

Authors: Pan Zhou, Caiming Xiong, Xiaotong Yuan, Steven Chu Hong Hoi

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on CIFAR10, ImageNet, VOC, and COCO show the effectiveness of our method.
Researcher Affiliation | Collaboration | Salesforce Research; Nanjing University of Information Science & Technology
Pseudocode | Yes | See algorithm details in Algorithm 1 of Appendix A.
Open Source Code | Yes | Our PyTorch code is available at https://openreview.net/forum?id=P84bifNCpFQ&referrer=%5BAuthor%20Console%5D.
Open Datasets | Yes | We use standard public datasets, including CIFAR10, ImageNet, VOC, and COCO, which researchers are permitted to use.
Dataset Splits | Yes | On VOC, we train the detection head on VOC07+12 trainval data and test it on VOC07 test data. On COCO, we train the head on the train2017 set and evaluate on val2017.
Hardware Specification | Yes | We use a single V100 GPU for training on CIFAR10, and 32 GPUs for 800 training epochs on ImageNet.
Software Dependencies | No | The paper mentions "PyTorch code" but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | Settings. We use ResNet50 [49] with a 3-layer MLP head for CIFAR10 [50] and ImageNet [21]. We first pretrain SANE, and then train a linear classifier on top of the 2048-dimensional frozen features of ResNet50. With a dictionary size of 4,096, we pretrain for 2,000 epochs on CIFAR10, instead of the 4,000 epochs used by MoCo, BYOL, and i-Mix in [28]. The dictionary size on ImageNet is 65,536. For the linear classifier, we train 200/100 epochs on CIFAR10/ImageNet. See all optimizer settings in Appendix A. We use the standard data augmentations of [1] for pretraining and testing unless otherwise stated; e.g., at test time we only normalize on CIFAR10, and use a center crop plus normalization on ImageNet. For SANE, we set the two temperatures to τ = 0.2 and 0.8 with κ = 2 in Beta(κ, κ) on CIFAR10, and to τ = 0.2 and 1 with κ = 0.1 on ImageNet. For the confidence µ, we increase it as µ_t = m2 − (m2 − m1)(cos(πt/T) + 1)/2, with current iteration t and total training iterations T. We set m1 = 0, m2 = 1 on CIFAR10, and m1 = 0.5, m2 = 10 on ImageNet. For KNN evaluation on CIFAR10, the neighborhood size is 50 and the temperature is 0.05.
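The confidence schedule above is a plain cosine ramp from m1 at iteration 0 to m2 at iteration T. A minimal sketch in Python (the function name confidence_mu is ours, not the authors'; the minus signs are reconstructed from the stated endpoints, since the schedule is described as increasing from m1 to m2):

```python
import math

def confidence_mu(t: int, T: int, m1: float, m2: float) -> float:
    """Cosine ramp for the confidence weight:
    mu_t = m2 - (m2 - m1) * (cos(pi * t / T) + 1) / 2.

    Increases from m1 at t = 0 to m2 at t = T, as stated in the setup.
    """
    return m2 - (m2 - m1) * (math.cos(math.pi * t / T) + 1.0) / 2.0

# CIFAR10 setting from the paper: m1 = 0, m2 = 1.
print(confidence_mu(0, 1000, 0.0, 1.0))     # 0.0 (starts at m1)
print(confidence_mu(500, 1000, 0.0, 1.0))   # 0.5 (midpoint)
print(confidence_mu(1000, 1000, 0.0, 1.0))  # 1.0 (ends at m2)
```

The KNN numbers reported for CIFAR10 (neighborhood size 50, temperature 0.05) match the weighted-vote KNN probe commonly used in contrastive-learning codebases; the sketch below assumes that standard probe, and knn_predict is an illustrative name rather than a function from the paper's code:

```python
import torch
import torch.nn.functional as F

def knn_predict(features: torch.Tensor, bank: torch.Tensor,
                bank_labels: torch.Tensor, num_classes: int,
                k: int = 50, temperature: float = 0.05) -> torch.Tensor:
    """Weighted-vote KNN probe over a feature bank.

    features:    [B, D] L2-normalized query features
    bank:        [N, D] L2-normalized features of the training set
    bank_labels: [N]    integer class labels of the bank
    """
    sim = features @ bank.t()              # [B, N] cosine similarities
    sim_k, idx_k = sim.topk(k, dim=-1)     # top-k neighbors per query
    weights = (sim_k / temperature).exp()  # temperature-scaled vote weights
    votes = F.one_hot(bank_labels[idx_k], num_classes).float()  # [B, k, C]
    scores = (weights.unsqueeze(-1) * votes).sum(dim=1)         # [B, C]
    return scores.argmax(dim=-1)           # predicted class per query
```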