Mosaic Representation Learning for Self-supervised Visual Pre-training
Authors: Zhaoqing Wang, Ziyu Chen, Yaqian Li, Yandong Guo, Jun Yu, Mingming Gong, Tongliang Liu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate that our method improves performance by a far greater margin than the multi-crop strategy on a series of downstream tasks, e.g., +7.4% and +4.9% over the multi-crop strategy on ImageNet-1K with 1% and 10% labels, respectively. |
| Researcher Affiliation | Collaboration | Zhaoqing Wang (1,4), Ziyu Chen (4), Yaqian Li (4), Yandong Guo (4), Jun Yu (3), Mingming Gong (2), Tongliang Liu (1); 1: Sydney AI Centre, The University of Sydney; 2: The University of Melbourne; 3: University of Science and Technology of China; 4: OPPO Research Institute |
| Pseudocode | Yes | Appendix D (Pseudo-codes of Mosaic Augmentation Strategy), Algorithm 1: Mosaic augmentation strategy; an illustrative sketch of the tiling step is given below the table. |
| Open Source Code | Yes | Code is available at https://github.com/DerrickWang005/MosRep.git. |
| Open Datasets | Yes | Datasets: We perform self-supervised pre-training on two datasets, one medium-scale and one large-scale: 1) 100-category ImageNet (IN-100) (Tian et al., 2019), a subset of the IN-1K dataset containing 125k images; and 2) 1000-category ImageNet (IN-1K) (Deng et al., 2009), the standard ImageNet training set containing 1.25M images. |
| Dataset Splits | Yes | We evaluate the pre-trained models on the task of classification with limited ImageNet labels. The sizes of annotations are reduced to 1% and 10% of the IN-1K (Deng et al., 2009) training dataset, respectively. ... evaluated on COCO 2017 val set. ... evaluated on Cityscapes val set. |
| Hardware Specification | Yes | with a mini-batch size of 256 on 8 NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for ancillary software dependencies. |
| Experiment Setup | Yes | As for the MoCo version, we pre-train the network on IN-100 and IN-1K for 400 and 200 epochs, respectively. The SGD optimizer with a cosine learning rate scheduler (Loshchilov & Hutter, 2016) and a base learning rate of 0.3 is adopted, with a mini-batch size of 256 on 8 NVIDIA V100 GPUs. We utilize a negative queue of 16,384 for IN-100 and 65,536 for IN-1K. The weight decay is 0.0001 and the SGD momentum is 0.9. A configuration sketch with these hyper-parameters follows the table. |
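The pseudocode row above points to the paper's Algorithm 1 (mosaic augmentation). Below is a minimal PyTorch-style sketch of the core tiling step only, assuming four pre-cropped views of equal size; the crop sampling, jittering, and batch-level mixing described in the paper's Algorithm 1 are omitted, and the function name `mosaic_augment` is our own placeholder, not the authors' code.

```python
import torch


def mosaic_augment(crops):
    """Tile four equally sized crops of shape (C, H, W) into one 2x2 mosaic.

    Simplified sketch: the paper's Algorithm 1 additionally samples crop
    locations/scales and mixes crops across the mini-batch.
    """
    assert len(crops) == 4, "expects exactly four crops"
    top = torch.cat([crops[0], crops[1]], dim=2)     # join along width
    bottom = torch.cat([crops[2], crops[3]], dim=2)  # join along width
    return torch.cat([top, bottom], dim=1)           # join along height


# Example: four 112x112 crops produce a single 224x224 mosaic image.
mosaic = mosaic_augment([torch.rand(3, 112, 112) for _ in range(4)])
assert mosaic.shape == (3, 224, 224)
```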
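The experiment-setup row can likewise be read as a standard PyTorch training configuration. The sketch below only instantiates the optimizer and cosine schedule with the reported hyper-parameters (base LR 0.3, momentum 0.9, weight decay 1e-4, 200 epochs on IN-1K); the ResNet-50 backbone is an assumption based on the MoCo setting, and the MoCo queue, momentum encoder, and data loading are not shown.

```python
import torch
from torchvision.models import resnet50

# Reported hyper-parameters: mini-batch size 256 on 8 V100 GPUs, base LR 0.3,
# SGD momentum 0.9, weight decay 1e-4, 200 epochs for IN-1K (400 for IN-100).
epochs = 200
base_lr = 0.3

model = resnet50()  # assumed backbone; the paper follows the MoCo setup
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=base_lr,
    momentum=0.9,
    weight_decay=1e-4,
)
# Cosine learning-rate decay over the full pre-training run.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
```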