Self-supervised Learning is More Robust to Dataset Imbalance
Authors: Hong Liu, Jeff Z. HaoChen, Adrien Gaidon, Tengyu Ma
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | First, we find out via extensive experiments that off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations. The performance gap between balanced and imbalanced pre-training with SSL is significantly smaller than the gap with supervised learning, across sample sizes, for both in-domain and, especially, out-of-domain evaluation. |
| Researcher Affiliation | Collaboration | Hong Liu, Stanford University, hliu99@stanford.edu; Jeff Z. HaoChen, Stanford University, jhaochen@stanford.edu; Adrien Gaidon, Toyota Research Institute, adrien.gaidon@tri.global; Tengyu Ma, Stanford University, tengyuma@stanford.edu |
| Pseudocode | Yes | Algorithm 1: Reweighted Sharpness-Aware Minimization (rwSAM); a hedged sketch of one rwSAM step follows the table. |
| Open Source Code | Yes | Code is available at https://github.com/Liuhong99/Imbalanced-SSL. |
| Open Datasets | Yes | We pre-train the representations on variants of ImageNet (Russakovsky et al., 2015) or CIFAR-10 (Krizhevsky & Hinton, 2009) with a wide range of numbers of examples and ratios of imbalance. (A long-tailed subsampling sketch follows the table.) |
| Dataset Splits | Yes | For ID evaluation, we use the original CIFAR-10 or ImageNet training set for the training phase of the linear probe and use the original validation set for the final evaluation. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions various software components and models like ResNet-18, ResNet-50, MoCo v2, SimSiam, SimCLR, Grad-CAM, and RandAugment, but it does not specify any version numbers for these software dependencies. |
| Experiment Setup | Yes | For self-supervised learning, the initial learning rate on the standard ImageNet-LT is set to 0.025 with batch size 256. We train the model for 300 epochs on the standard ImageNet-LT and adopt cosine learning rate decay following (He et al., 2020; Chen & He, 2021). We set the initial learning rate to 30 when training the linear head with batch size 4096 and train for 100 epochs in total. (See the configuration sketch after the table.) |
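The pseudocode entry refers to the paper's Algorithm 1 (rwSAM). Below is a minimal PyTorch sketch of one reweighted sharpness-aware-minimization step, not the authors' exact algorithm: the per-example weights `weights` (e.g., larger for rare examples), the helper name `rw_sam_step`, and the radius `rho` are illustrative placeholders, and `loss_fn` is assumed to return per-example losses (`reduction='none'`).

```python
import torch

def rw_sam_step(model, loss_fn, x, y, weights, optimizer, rho=0.05):
    """One reweighted SAM update (sketch): weighted loss -> ascent step -> descent step."""
    # 1) Weighted loss at the current parameters and its gradient.
    loss = (weights * loss_fn(model(x), y)).mean()
    loss.backward()

    # 2) Move to the approximate worst-case nearby point: w + rho * g / ||g||.
    grad_norm = torch.norm(torch.stack(
        [p.grad.norm() for p in model.parameters() if p.grad is not None]))
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)
    optimizer.zero_grad()

    # 3) Gradient of the weighted loss at the perturbed parameters.
    loss_adv = (weights * loss_fn(model(x), y)).mean()
    loss_adv.backward()

    # 4) Undo the perturbation, then step with the SAM gradient.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
```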
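The open-datasets entry says pre-training uses imbalanced variants of CIFAR-10 or ImageNet with varying sample sizes and imbalance ratios. The sketch below builds an exponentially imbalanced ("long-tailed") CIFAR-10 subset; the exponential class-size profile, the ratio of 100, and the helper name `long_tailed_indices` are assumptions for illustration, not the authors' released sampling code.

```python
import numpy as np
from torchvision.datasets import CIFAR10

def long_tailed_indices(targets, num_classes=10, max_per_class=5000, ratio=100):
    """Pick indices for an exponentially imbalanced subset.

    Class c keeps max_per_class * ratio**(-c / (num_classes - 1)) examples,
    so class 0 keeps max_per_class and the last class keeps max_per_class / ratio.
    """
    targets = np.asarray(targets)
    keep = []
    for c in range(num_classes):
        n_c = int(max_per_class * ratio ** (-c / (num_classes - 1)))
        cls_idx = np.where(targets == c)[0]
        keep.extend(np.random.choice(cls_idx, n_c, replace=False))
    return np.array(keep)

# Usage: build the imbalanced pre-training split in place.
train_set = CIFAR10(root="./data", train=True, download=True)
idx = long_tailed_indices(train_set.targets, ratio=100)
train_set.data = train_set.data[idx]
train_set.targets = list(np.asarray(train_set.targets)[idx])
```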
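The experiment-setup entry reports the pre-training and linear-probe hyperparameters. The configuration sketch below restates them in PyTorch (lr 0.025, batch size 256, 300 epochs with cosine decay for SSL pre-training; lr 30, batch size 4096, 100 epochs for the linear head trained on the original balanced split). The momentum and weight-decay values, and the placeholder backbone `encoder`, are assumptions not given in the excerpt.

```python
import torch
import torch.nn as nn
import torchvision
from torch.optim.lr_scheduler import CosineAnnealingLR

# Placeholder backbone: ResNet-50 trunk with the classifier removed.
encoder = torchvision.models.resnet50()
feature_dim = encoder.fc.in_features          # 2048 for ResNet-50
encoder.fc = nn.Identity()
num_classes = 1000                            # ImageNet-LT evaluation classes

# SSL pre-training phase (MoCo v2 / SimSiam style): SGD with cosine decay,
# matching the reported lr = 0.025, batch size 256, 300 epochs.
pretrain_opt = torch.optim.SGD(encoder.parameters(), lr=0.025,
                               momentum=0.9, weight_decay=1e-4)   # momentum/wd assumed
pretrain_sched = CosineAnnealingLR(pretrain_opt, T_max=300)       # stepped once per epoch

# Linear probe: freeze the encoder and train only a linear head on the
# original (balanced) training split, lr = 30, batch size 4096, 100 epochs.
for p in encoder.parameters():
    p.requires_grad = False
head = nn.Linear(feature_dim, num_classes)
probe_opt = torch.optim.SGD(head.parameters(), lr=30.0, momentum=0.9, weight_decay=0.0)
probe_sched = CosineAnnealingLR(probe_opt, T_max=100)
```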