Measuring Self-Supervised Representation Quality for Downstream Classification Using Discriminative Features
Authors: Neha Kalibhat, Kanika Narang, Hamed Firooz, Maziar Sanjabi, Soheil Feizi
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Self-supervised learning (SSL) has shown impressive results in downstream classification tasks. However, there is limited work in understanding their failure modes and interpreting their learned representations. In this paper, we study the representation space of state-of-the-art self-supervised models including SimCLR, SwAV, MoCo, BYOL, DINO, SimSiam, VICReg and Barlow Twins. ... We empirically observe that Q-Score can be used as a zero-shot predictor in distinguishing between correct and incorrect classifications for any SSL model, achieving AUPRC of 91.45 on ImageNet-100 and 78.78 AUPRC on ImageNet-1K. |
| Researcher Affiliation | Collaboration | 1University of Maryland, College Park 2Meta AI |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper builds on implementations from 'solo-learn (da Costa et al. 2022)' but does not provide a repository link or release its own source code for the described methodology. |
| Open Datasets | Yes | We use a ResNet-50 encoder for our ImageNet-1K experiments and ResNet-18 encoder for all other datasets. We discover discriminative features for each pre-trained model using the train set of each dataset. ... We include more results on CIFAR-10 (Krizhevsky, Nair, and Hinton 2009a), STL-10 (Coates, Lee, and Ng 2011) and CIFAR-100 (Krizhevsky, Nair, and Hinton 2009b) in the Appendix. |
| Dataset Splits | No | The paper mentions using train sets from datasets like ImageNet and standard evaluation protocols but does not explicitly provide specific percentages, sample counts, or detailed methodology for training/validation/test dataset splits. |
| Hardware Specification | Yes | We use a maximum of 4 NVIDIA RTX A4000 GPUs (16GB memory) for all our experiments. |
| Software Dependencies | No | The paper mentions using 'solo-learn' and the 'LARS optimizer' but does not provide specific version numbers for any software dependencies like programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | We further-train each pre-trained model with and without Q-Score regularization (controlled by λ1 and λ2) using a low learning rate of 10⁻⁵ for 50 epochs. We find that λ1 = λ2 = 10⁻⁴ generally works well for fine-tuning. |
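The fine-tuning objective described in the setup row can be sketched as a base SSL loss plus two regularization terms weighted by λ1 and λ2. This is a minimal, hypothetical illustration only: the paper's actual Q-Score computation is not reproduced in this report, so `reg_term_1` and `reg_term_2` below are placeholder stand-ins; only the hyperparameter values (λ1 = λ2 = 10⁻⁴) come from the paper.

```python
import numpy as np

# Reported hyperparameters; the regularizer definitions below are
# placeholders, NOT the paper's Q-Score formulation.
LAMBDA1 = 1e-4
LAMBDA2 = 1e-4

def reg_term_1(z):
    # Placeholder: penalize low per-dimension feature variance
    # across the batch (illustrative assumption).
    return float(np.mean(1.0 / (z.var(axis=0) + 1e-8)))

def reg_term_2(z):
    # Placeholder: penalize dense (non-sparse) activations
    # (illustrative assumption).
    return float(np.mean(np.abs(z)))

def total_loss(base_loss, z, lam1=LAMBDA1, lam2=LAMBDA2):
    """Base SSL objective plus two weighted regularization terms."""
    return base_loss + lam1 * reg_term_1(z) + lam2 * reg_term_2(z)

rng = np.random.default_rng(0)
z = rng.normal(size=(32, 128))  # a batch of 32 representations
loss = total_loss(base_loss=1.0, z=z)
```

With λ values this small, the regularizers only gently perturb the base objective, which matches the paper's observation that a low learning rate (10⁻⁵) over 50 epochs suffices for this fine-tuning.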