Measuring Self-Supervised Representation Quality for Downstream Classification Using Discriminative Features

Authors: Neha Kalibhat, Kanika Narang, Hamed Firooz, Maziar Sanjabi, Soheil Feizi

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Self-supervised learning (SSL) has shown impressive results in downstream classification tasks. However, there is limited work in understanding their failure modes and interpreting their learned representations. In this paper, we study the representation space of state-of-the-art self-supervised models including SimCLR, SwAV, MoCo, BYOL, DINO, SimSiam, VICReg and Barlow Twins. ... We empirically observe that Q-Score can be used as a zero-shot predictor in distinguishing between correct and incorrect classifications for any SSL model, achieving AUPRC of 91.45 on ImageNet-100 and 78.78 AUPRC on ImageNet-1K.
Researcher Affiliation | Collaboration | 1. University of Maryland, College Park; 2. Meta AI
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper uses implementations from 'solo-learn (da Costa et al. 2022)' but does not provide concrete access to its own source code or a repository link for the methodology described.
Open Datasets | Yes | We use a ResNet-50 encoder for our ImageNet-1K experiments and a ResNet-18 encoder for all other datasets. We discover discriminative features for each pre-trained model using the train set of each dataset. ... We include more results on CIFAR-10 (Krizhevsky, Nair, and Hinton 2009a), STL-10 (Coates, Lee, and Ng 2011) and CIFAR-100 (Krizhevsky, Nair, and Hinton 2009b) in the Appendix.
Dataset Splits | No | The paper mentions using train sets from datasets like ImageNet and standard evaluation protocols, but does not explicitly provide percentages, sample counts, or a detailed methodology for training/validation/test splits.
Hardware Specification | Yes | We use a maximum of 4 NVIDIA RTX A4000 GPUs (16GB memory) for all our experiments.
Software Dependencies | No | The paper mentions using 'solo-learn' and the 'LARS optimizer' but does not provide version numbers for any software dependencies such as programming languages, libraries, or frameworks.
Experiment Setup | Yes | We further train each pre-trained model with and without Q-Score regularization (controlled by λ1 and λ2) using a low learning rate of 10^-5 for 50 epochs. We find that λ1 = λ2 = 10^-4 generally works well for fine-tuning.
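The fine-tuning setup above can be sketched minimally. Only the quoted hyperparameters (learning rate 10^-5, 50 epochs, λ1 = λ2 = 10^-4) come from the paper; the regularization terms themselves are hypothetical placeholders, since the Q-Score terms are not reproduced here.

```python
# Hedged sketch of the reported fine-tuning configuration.
# The hyperparameter values are quoted from the paper; the loss
# structure (base loss plus two weighted regularizers) is an
# assumption about how lambda1 and lambda2 "control" the
# Q-Score regularization.

LEARNING_RATE = 1e-5   # "low learning rate of 10^-5"
EPOCHS = 50            # "for 50 epochs"
LAMBDA_1 = 1e-4        # reported as working well for fine-tuning
LAMBDA_2 = 1e-4        # reported as working well for fine-tuning

def total_loss(task_loss: float, reg_term_1: float, reg_term_2: float) -> float:
    """Combine the base training loss with two regularization
    terms, weighted by lambda1 and lambda2 (placeholder terms,
    not the paper's actual Q-Score computation)."""
    return task_loss + LAMBDA_1 * reg_term_1 + LAMBDA_2 * reg_term_2
```

With λ1 = λ2 = 10^-4, the regularizers contribute only a small fraction of the total loss, which is consistent with the gentle, low-learning-rate further-training the paper describes.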