RSA: Reducing Semantic Shift from Aggressive Augmentations for Self-supervised Learning
Authors: Yingbin Bai, Erkun Yang, Zhaoqing Wang, Yuxuan Du, Bo Han, Cheng Deng, Dadong Wang, Tongliang Liu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our model achieves 73.1% top-1 accuracy on ImageNet-1K with ResNet-50 for 200 epochs, which is a 2.5% improvement over BYOL. Moreover, experiments also demonstrate that the learned representations can transfer well for various downstream tasks. |
| Researcher Affiliation | Collaboration | Yingbin Bai¹, Erkun Yang², Zhaoqing Wang¹, Yuxuan Du³, Bo Han⁴, Cheng Deng², Dadong Wang⁵, Tongliang Liu¹ (¹TML Lab, The University of Sydney; ²Xidian University; ³JD Explore Academy; ⁴Hong Kong Baptist University; ⁵CSIRO) |
| Pseudocode | Yes | Algorithm 1: RSA: Reducing semantic shift |
| Open Source Code | Yes | Code is released at: https://github.com/tmllab/RSA. |
| Open Datasets | Yes | Datasets: We assess the proposed method on six image datasets, from small to large. We choose CIFAR-10/100 [24] for small datasets, STL-10 [10] and Tiny ImageNet [1] for medium datasets, and ImageNet-100 [42] and ImageNet-1K [25] for large datasets. |
| Dataset Splits | Yes | For STL-10, both 5k labeled and 100k unlabeled images are used for the pre-trained model, and only 5k labeled images are used for the linear evaluation. [...] We evaluate the representations of the pre-trained model with the linear evaluation protocol, which freezes the encoder parameters and trains a linear classifier on top of the pre-trained model. (A minimal linear-evaluation sketch is given below the table.) |
| Hardware Specification | Yes | Our method and reproduced methods are implemented by PyTorch v1.8 and we conduct all experiments on Nvidia V100. [...] We conduct ImageNet-1K experiments with 8 Nvidia V100 32G with Automatic Mixed Precision (AMP) package [33]. (An AMP usage sketch is given below the table.) |
| Software Dependencies | Yes | Our method and reproduced methods are implemented by PyTorch v1.8 |
| Experiment Setup | Yes | For optimization, we use SGD optimizer with a cosine-annealed learning rate of 0.1 [32], a momentum of 0.9, weight decay of 5×10⁻⁴, and a batch size of 256. We set β_base = 0.3 for CIFAR-10/100 and β_base = 0.4 for Tiny ImageNet and STL-10. (An optimizer-configuration sketch is given below the table.) |
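
The linear evaluation protocol quoted in the Dataset Splits row can be illustrated in code. The following is a minimal sketch, assuming a standard PyTorch/torchvision setup; the ResNet-50 backbone, the 2048-dimensional feature size, the classifier hyperparameters, and the helper `linear_eval_step` are illustrative placeholders, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torchvision

# Placeholder backbone standing in for the pre-trained RSA encoder.
encoder = torchvision.models.resnet50()
encoder.fc = nn.Identity()             # drop the original classification head
for p in encoder.parameters():
    p.requires_grad = False            # freeze encoder parameters, as in the protocol
encoder.eval()

num_classes = 10                       # e.g. STL-10 / CIFAR-10 have 10 classes
linear = nn.Linear(2048, num_classes)  # 2048 is the ResNet-50 feature dimension
optimizer = torch.optim.SGD(linear.parameters(), lr=0.1, momentum=0.9)  # illustrative hyperparameters
criterion = nn.CrossEntropyLoss()

def linear_eval_step(images, labels):
    """One training step of the linear classifier on frozen representations."""
    with torch.no_grad():
        features = encoder(images)     # representations come from the frozen encoder
    logits = linear(features)
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```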
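
The Hardware Specification row notes that the ImageNet-1K runs use PyTorch's Automatic Mixed Precision package. Below is a minimal sketch of how AMP is typically wired into a training step; the model, optimizer, and loss here are placeholders, and the exact integration in the released code may differ.

```python
import torch

model = torch.nn.Linear(2048, 1000).cuda()    # placeholder model (requires a CUDA device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()          # AMP gradient scaler

def amp_train_step(inputs, targets):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():           # run the forward pass in mixed precision
        outputs = model(inputs)
        loss = criterion(outputs, targets)
    scaler.scale(loss).backward()             # scale the loss to avoid fp16 gradient underflow
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```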
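
The Experiment Setup row fully specifies the optimizer used for the small and medium datasets. The sketch below shows that configuration in PyTorch: the SGD hyperparameters (lr 0.1 with cosine annealing, momentum 0.9, weight decay 5×10⁻⁴, batch size 256) follow the quoted setup, while the backbone, the dummy data, and the epoch count are assumptions made only for illustration.

```python
import torch
import torchvision
from torch.utils.data import DataLoader, TensorDataset

# Placeholder backbone and dummy data; the paper pre-trains ResNet encoders on the listed datasets.
model = torchvision.models.resnet18()
dataset = TensorDataset(torch.randn(1024, 3, 32, 32), torch.zeros(1024, dtype=torch.long))
loader = DataLoader(dataset, batch_size=256, shuffle=True)    # batch size 256, as quoted

# SGD with lr 0.1, momentum 0.9, weight decay 5e-4, as quoted in the Experiment Setup row.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

# Cosine-annealed learning rate over the pre-training run (the epoch count is an assumption here).
num_epochs = 200
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)

for epoch in range(num_epochs):
    for images, labels in loader:
        # ... compute the self-supervised loss and update the model here ...
        pass
    scheduler.step()                                          # anneal the learning rate once per epoch
```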