Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Measuring the Interpretability of Unsupervised Representations via Quantized Reversed Probing
Authors: Iro Laina, Yuki M Asano, Andrea Vedaldi
ICLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use our method to evaluate a large number of self-supervised representations, ranking them by interpretability, highlight the differences that emerge compared to the standard evaluation with linear probes and discuss several qualitative insights. We use our method to evaluate a wide range of recent self-supervised representation learning and clustering techniques. |
| Researcher Affiliation | Academia | Iro Laina University of Oxford EMAIL Yuki M. Asano University of Amsterdam EMAIL Andrea Vedaldi University of Oxford EMAIL |
| Pseudocode | No | The paper describes the method using text and diagrams (Figure 1), but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code at: https://github.com/iro-cp/ssl-qrp. |
| Open Datasets | Yes | in our experiments we focus on ImageNet (IN-1k) (Deng et al., 2009) as the target data due to the large availability of pre-trained self-supervised models on this dataset. We apply the model to our target datasets and keep only object categories predicted with a confidence higher than 0.5. We use a DeepLab-v2 model (Chen et al., 2017) trained for segmentation on MS COCO 2017 (Lin et al., 2014; Caesar et al., 2018). |
| Dataset Splits | Yes | We divide the data into train and test sets by splitting all cluster assignments with a 80/20 ratio and stratified sampling; from the training split we also reserve 20% of the data for validation. |
| Hardware Specification | Yes | Our method computes cluster assignments using the efficient K-means implementation of faiss (which takes less than 5min for 256k 2048-d vectors on 4 NVIDIA RTX A4000). Training of the linear model converges in less than 100 epochs in a matter of minutes on a single GPU (1 epoch takes 2sec). |
| Software Dependencies | No | The paper mentions using 'faiss' for K-means but does not specify its version number or any other software dependencies with explicit version details. |
| Experiment Setup | Yes | We train for up to 100 epochs with batch size 512 and optimize using SGD with a momentum of 0.9 and initial learning rate of 3.5 which is further reduced by a factor of 10 at epochs 60 and 80. We also add L2-regularization with weight 3 × 10⁻⁶. |
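
The Dataset Splits and Experiment Setup rows describe a concrete procedure: an 80/20 stratified train/test split of the cluster assignments, a further 20% of the training split held out for validation, and a learning rate of 3.5 divided by 10 at epochs 60 and 80. The following is a minimal standard-library sketch of those two steps, not the authors' code. The helper names `stratified_split` and `learning_rate`, the toy `assignments` list, the seeds, and the zero-indexed reading of "epochs 60 and 80" are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, held_out_frac, seed=0):
    """Split positions of `labels` into (kept, held_out) index lists,
    holding out roughly `held_out_frac` of every label group.
    Hypothetical helper, not taken from the paper's repository."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    kept, held_out = [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        n_held = round(len(idxs) * held_out_frac)
        held_out.extend(idxs[:n_held])
        kept.extend(idxs[n_held:])
    return kept, held_out


def learning_rate(epoch, base_lr=3.5, milestones=(60, 80), factor=0.1):
    """Step schedule: multiply base_lr by `factor` once for every
    milestone the (zero-indexed, assumed) epoch has reached."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** drops


# Toy stand-in for cluster assignments: 10 clusters, 100 samples each.
assignments = [i % 10 for i in range(1000)]

# 80/20 stratified train/test split over the cluster assignments.
train_val, test = stratified_split(assignments, 0.2, seed=0)

# Reserve 20% of the training split for validation, again stratified.
train_val_labels = [assignments[i] for i in train_val]
train_pos, val_pos = stratified_split(train_val_labels, 0.2, seed=1)
train = [train_val[p] for p in train_pos]
val = [train_val[p] for p in val_pos]

print(len(train), len(val), len(test))
print([learning_rate(e) for e in (0, 59, 60, 80)])
```

On this toy input the resulting split is 64% train, 16% validation, and 20% test. In practice the same split and schedule would more likely come from an existing library; the plain-Python version is used here only to keep the sketch self-contained.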