Self-Supervised Learning via Maximum Entropy Coding

Authors: Xin Liu, Zhongdao Wang, Ya-Li Li, Shengjin Wang

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments demonstrate that MEC learns a more generalizable representation than previous methods based on specific pretext tasks. It achieves state-of-the-art performance consistently on various downstream tasks, including not only ImageNet linear probe, but also semi-supervised classification, object detection, instance segmentation, and object tracking." |
| Researcher Affiliation | Academia | "Xin Liu, Zhongdao Wang, Ya-Li Li, Shengjin Wang; Beijing National Research Center for Information Science and Technology (BNRist), Department of Electronic Engineering, Tsinghua University; {xinliu20, wcd17}@mails.tsinghua.edu.cn, {liyali13, wgsgj}@tsinghua.edu.cn" |
| Pseudocode | Yes | "An overview of MEC is illustrated in Figure 2 and a PyTorch-like pseudocode is provided in Appendix A." |
| Open Source Code | Yes | "Code and pre-trained models are available at https://github.com/xinliu20/MEC." |
| Open Datasets | Yes | "We perform self-supervised pre-training using the proposed MEC on the training set of the ImageNet ILSVRC-2012 dataset [17]." |
| Dataset Splits | Yes | "We train a linear classifier on top of frozen representations of the pre-trained model on the ImageNet training set, and report the top-1 accuracy on the ImageNet validation set, which is a standard and important protocol in SSL [88, 12, 31, 14, 86]. ... We fine-tune the pre-trained model on a small subset of ImageNet for the classification task. Specifically, we adopt the same fixed splits of 1% and 10% of the ImageNet training set as in [12] and report both top-1 and top-5 accuracies in Table 2." |
| Hardware Specification | Yes | "The total amount of compute used for this research is approximately 1,600 GPU hours per experiment on 8 NVIDIA V100 GPUs." |
| Software Dependencies | No | "Our implementation is based on PyTorch [50] and uses Detectron2 [79] for object detection and instance segmentation experiments. We use a single ResNet-50 [33] backbone for all ImageNet linear probing and semi-supervised classification experiments." No specific version numbers for PyTorch or Detectron2 are provided. |
| Experiment Setup | Yes | "We implement our method based on PyTorch and train ResNet-50 models using SGD with momentum. We set the base learning rate to 0.05, batch size to 256, and weight decay to 1e-4 for 100 epochs unless specified otherwise." |
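The Appendix A pseudocode the report points to approximates the maximum entropy coding objective, a log-determinant of a cross-correlation matrix, with a truncated Taylor series. A minimal NumPy sketch of that idea is below; the function name `mec_loss`, the coefficients `lamda` and `mu`, and the expansion order are illustrative assumptions here, not the paper's exact values.

```python
import numpy as np

def mec_loss(z1, z2, lamda, mu, order=4):
    """Taylor-series approximation of -mu * log det(I + lamda * z1 @ z2.T).

    z1, z2 : (n, d) arrays of L2-normalized representations.
    Uses log det(I + C) = Tr(sum_k (-1)^(k+1) C^k / k), valid when the
    spectral radius of C is below 1. Coefficients are illustrative.
    """
    c = lamda * (z1 @ z2.T)          # (n, n) cross-correlation matrix
    power = c.copy()                  # running power C^k
    total = np.zeros_like(c)
    for k in range(1, order + 1):
        if k > 1:
            power = power @ c
        total += ((-1) ** (k + 1) / k) * power
    return -mu * np.trace(total)
```

With a small `lamda`, a modest `order` already tracks the exact log-determinant closely, which is why a few matrix multiplications suffice in practice.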
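The training recipe in the last row (base learning rate 0.05 at batch size 256, SGD with momentum) is commonly paired in SSL pre-training with the linear scaling rule and cosine decay. The snippet below sketches that common schedule under those assumptions; the paper's exact schedule (e.g. any warmup) may differ.

```python
import math

def lr_at_epoch(epoch, total_epochs=100, base_lr=0.05, batch_size=256):
    """Cosine-decayed learning rate with linear batch-size scaling.

    Assumption: plain cosine decay without warmup; defaults mirror the
    reported recipe (base LR 0.05, batch 256, 100 epochs).
    """
    scaled = base_lr * batch_size / 256  # linear scaling rule
    return 0.5 * scaled * (1 + math.cos(math.pi * epoch / total_epochs))
```

For example, the schedule starts at 0.05 at epoch 0, halves to 0.025 at the midpoint, and decays to 0 at epoch 100; doubling the batch size to 512 doubles the starting rate to 0.1.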