RSA: Reducing Semantic Shift from Aggressive Augmentations for Self-supervised Learning

Authors: Yingbin Bai, Erkun Yang, Zhaoqing Wang, Yuxuan Du, Bo Han, Cheng Deng, Dadong Wang, Tongliang Liu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our model achieves 73.1% top-1 accuracy on ImageNet-1K with ResNet-50 for 200 epochs, which is a 2.5% improvement over BYOL. Moreover, experiments also demonstrate that the learned representations can transfer well for various downstream tasks.
Researcher Affiliation | Collaboration | Yingbin Bai (1), Erkun Yang (2), Zhaoqing Wang (1), Yuxuan Du (3), Bo Han (4), Cheng Deng (2), Dadong Wang (5), Tongliang Liu (1). Affiliations: (1) TML Lab, The University of Sydney; (2) Xidian University; (3) JD Explore Academy; (4) Hong Kong Baptist University; (5) CSIRO
Pseudocode | Yes | Algorithm 1: RSA: Reducing semantic shift
Open Source Code | Yes | Code is released at: https://github.com/tmllab/RSA.
Open Datasets | Yes | Datasets: We assess the proposed method on six image datasets, from small to large. We choose CIFAR-10/100 [24] for small datasets, STL-10 [10] and Tiny ImageNet [1] for medium datasets, and ImageNet-100 [42] and ImageNet-1K [25] for large datasets.
Dataset Splits | Yes | For STL-10, both 5k labeled and 100k unlabeled images are used for the pre-trained model, and only 5k labeled images are used for the linear evaluation. [...] We evaluate the representations of the pre-trained model with the linear evaluation protocol, which freezes the encoder parameters and trains a linear classifier on top of the pre-trained model. (A sketch of this linear-evaluation protocol follows the table.)
Hardware Specification | Yes | Our method and reproduced methods are implemented by PyTorch v1.8 and we conduct all experiments on Nvidia V100. [...] We conduct ImageNet-1K experiments with 8 Nvidia V100 32G GPUs with the Automatic Mixed Precision (AMP) package [33]. (An AMP training-loop sketch follows the table.)
Software Dependencies | Yes | Our method and reproduced methods are implemented by PyTorch v1.8.
Experiment Setup | Yes | For optimization, we use the SGD optimizer with a cosine-annealed learning rate of 0.1 [32], a momentum of 0.9, weight decay of 5×10⁻⁴, and a batch size of 256. We set β_base = 0.3 for CIFAR-10/100 and β_base = 0.4 for Tiny ImageNet and STL-10. (An optimizer/scheduler configuration sketch follows the table.)
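
The linear evaluation protocol quoted under Dataset Splits (freeze the pre-trained encoder, train a linear classifier on its features, report top-1 accuracy) is a standard procedure; the sketch below is a minimal PyTorch version. The `encoder`, data loader, feature dimension (2048 for a ResNet-50 with its classification head removed), class count, and epoch count are placeholder assumptions for illustration, not values taken from the released code.

```python
import torch
import torch.nn as nn

# Assumptions (placeholders, not from the released code): `encoder` maps images
# to `feat_dim`-dimensional features and `train_loader` yields (images, labels).
def linear_evaluation(encoder, train_loader, num_classes=10,
                      feat_dim=2048, epochs=100, device="cuda"):
    encoder = encoder.to(device).eval()
    for p in encoder.parameters():            # freeze the pre-trained encoder
        p.requires_grad = False

    classifier = nn.Linear(feat_dim, num_classes).to(device)
    optimizer = torch.optim.SGD(classifier.parameters(), lr=0.1, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            with torch.no_grad():              # encoder is frozen: features only
                feats = encoder(images)
            loss = criterion(classifier(feats), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return classifier
```

Top-1 accuracy would then be measured with the frozen encoder and the trained classifier on the held-out evaluation split (e.g., the 5k labeled STL-10 images mentioned above).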
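The Hardware Specification row reports that ImageNet-1K pre-training uses 8 V100 GPUs with an AMP package [33]. Whether that refers to NVIDIA Apex or PyTorch's native torch.cuda.amp is not stated here; the sketch below assumes the native API available in PyTorch v1.8, with placeholder `model`, `loader`, `loss_fn`, and `optimizer`.

```python
import torch

# Placeholder training step; only the AMP mechanics (autocast + GradScaler)
# are the point of this sketch. `scaler` should be created once before
# training with torch.cuda.amp.GradScaler() and reused across epochs.
def train_one_epoch_amp(model, loader, loss_fn, optimizer, scaler, device="cuda"):
    model.train()
    for images, targets in loader:
        images, targets = images.to(device), targets.to(device)
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():        # forward pass in mixed precision
            loss = loss_fn(model(images), targets)
        scaler.scale(loss).backward()          # scale loss to avoid fp16 underflow
        scaler.step(optimizer)                 # unscale gradients, then step
        scaler.update()                        # adjust the loss scale for next step
```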
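The Experiment Setup row fully specifies the optimizer used for the small and medium datasets: SGD with a cosine-annealed learning rate starting at 0.1, momentum 0.9, weight decay 5×10⁻⁴, and batch size 256. The sketch below shows that configuration in PyTorch; `model`, `train_set`, `pretrain_epochs`, and `num_workers` are placeholders, and β_base enters the RSA loss itself rather than the optimizer, so it does not appear here.

```python
import torch
from torch.utils.data import DataLoader

# Hyperparameters quoted in the table; everything else is a placeholder
# rather than a value from the released code.
def build_optimization(model, train_set, pretrain_epochs=200, num_workers=8):
    loader = DataLoader(train_set, batch_size=256, shuffle=True,
                        num_workers=num_workers, drop_last=True)
    optimizer = torch.optim.SGD(model.parameters(),
                                lr=0.1,            # base LR, cosine-annealed below
                                momentum=0.9,
                                weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer,
                                                           T_max=pretrain_epochs)
    return loader, optimizer, scheduler
```

With this setup, scheduler.step() would be called once per epoch so the learning rate follows a cosine curve from 0.1 toward zero over the pre-training run.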