Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Representational Similarity via Interpretable Visual Concepts

Authors: Neehar Kondapaneni, Oisin Mac Aodha, Pietro Perona

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct extensive evaluation across different vision model architectures and training protocols to demonstrate its effectiveness. Project Page: RSVC Code: github.com/nkondapa/RSVC"
Researcher Affiliation | Academia | "Neehar Kondapaneni (1), Oisin Mac Aodha (2), Pietro Perona (1); (1) Caltech, (2) University of Edinburgh"
Pseudocode | Yes |
1: for i = 1 to K do
2:   U^i_{2→1} = Copy(U_1)
3:   U^i_{2→1}[:, i] = U_{2→1}[:, i]
4:   A^i_{2→1} = U^i_{2→1} W_1
5:   z^i_{2→1} = h(A^i_{2→1})
6:   y^i_{2→1} = argmax(z^i_{2→1})
7: end for
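The quoted pseudocode is a concept-replacement loop: for each concept i, model 1's concept activations are copied, column i is swapped in from model 2's cross-mapped activations, and the downstream head is re-run to see how the prediction changes. A minimal NumPy sketch of that loop; all names, shapes, and the identity of `h` are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def concept_replacement(U1, U2to1, W1, h):
    """Hypothetical shapes: U1, U2to1 are (N, K) concept activations,
    W1 is a (K, C) decoder. For each concept i, replace column i of U1
    with the cross-mapped column from U2to1, reconstruct activations,
    apply the head h, and record the predicted class per sample."""
    N, K = U1.shape
    preds = np.empty((K, N), dtype=int)
    for i in range(K):
        Ui = U1.copy()              # U^i_{2->1} = Copy(U_1)
        Ui[:, i] = U2to1[:, i]      # swap in concept i from model 2
        A = Ui @ W1                 # A^i_{2->1} = U^i_{2->1} W_1
        z = h(A)                    # z^i_{2->1} = h(A^i_{2->1})
        preds[i] = z.argmax(axis=1) # y^i_{2->1} = argmax(z^i_{2->1})
    return preds
```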
Open Source Code | Yes | "Project Page: RSVC Code: github.com/nkondapa/RSVC"
Open Datasets | Yes | "All models were trained on ImageNet (Deng et al., 2009). For our exploration with DINO and MAE, we finetune the models on NABirds (Van Horn et al., 2015). ... We train a ResNet-18 model on a combined dataset of NABirds and Stanford Cars (Krause et al., 2013)."
Dataset Splits | Yes | "The regression model is trained on a set of training images and evaluated on held-out test images, where the predicted outputs of the regressor are compared to the true neural responses using Pearson correlation, giving a score between -1 and 1. ... For each concept and class, we train five lasso-regression models on different equally sized folds."
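One reading of the quoted protocol, sketched with scikit-learn and SciPy: fit a lasso regressor per fold and score each on held-out data with Pearson correlation. The quote suggests one model is trained per fold; standard K-fold splitting, the `alpha` value, and the data shapes here are assumptions, not the paper's exact setup:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold
from scipy.stats import pearsonr

def fold_scores(X, y, alpha=0.1, n_folds=5, seed=0):
    """Train one lasso model per split and return the Pearson
    correlation between its held-out predictions and the targets."""
    scores = []
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for train_idx, test_idx in kf.split(X):
        model = Lasso(alpha=alpha).fit(X[train_idx], y[train_idx])
        r, _ = pearsonr(model.predict(X[test_idx]), y[test_idx])
        scores.append(r)  # score in [-1, 1], as in the quoted protocol
    return scores
```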
Hardware Specification | Yes | "All experiments were conducted on a machine with an AMD Ryzen 7 3700X 8-Core Processor and a single GeForce RTX 4090 GPU."
Software Dependencies | No | The paper mentions the timm library (Wightman, 2019), scikit-learn (Pedregosa et al., 2011), the Celer library (Massias et al., 2018), and the xplique library (Fel et al., 2022a), but does not provide version numbers for any of these dependencies.
Experiment Setup | Yes | "For the l1 penalty, we sweep λ on a subset of data and find λ = 0.1 to be a reasonable choice (Fig. A13). Additionally, we set the number of concepts k = 10 to balance reconstruction error and computational cost (Fig. A14). ... We use 50 images for each class from the training set of the model. Images are resized to 224 × 224. We use a patch size of 64 × 64, resulting in 16 patches per image. ... For all experiments we integrate over 30 steps."
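The patching in the quoted setup (224 × 224 images, 64 × 64 patches, 16 patches per image) implies an overlapping 4 × 4 grid, since four non-overlapping 64-pixel patches would need 256 pixels per side. A minimal sketch assuming evenly spaced start positions; the paper's exact stride and overlap are not stated, so this layout is an assumption:

```python
import numpy as np

def extract_patches(img, patch=64, grid=4):
    """Cut a square image into a grid x grid set of (possibly
    overlapping) patches of side `patch`. With a 224x224 image,
    patch=64 and grid=4 yield the 16 patches per image quoted above."""
    side = img.shape[0]
    # Evenly spaced top-left corners from 0 to side - patch (assumed layout).
    starts = np.linspace(0, side - patch, grid).round().astype(int)
    return [img[r:r + patch, c:c + patch] for r in starts for c in starts]
```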