Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Quantifying Task-relevant Similarities in Representations Using Decision Variable Correlations

Authors: Yu (Eric) Qian, Wilson Geisler, Xue-Xin Wei

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Here, we propose a new approach to characterize the similarity of the decision strategies of two observers (models or brains) using decision variable correlation (DVC). DVC quantifies the image-by-image correlation between the decoded decisions based on the internal neural representations in a classification task. Thus, it can capture task-relevant information rather than general representational alignment. We evaluate DVC using monkey V4/IT recordings and network models trained on image classification tasks. We find that model-model similarity is comparable to monkey-monkey similarity, whereas model-monkey similarity is consistently lower.
Researcher Affiliation	Academia	Yu (Eric) Qian Department of Neuroscience The University of Texas at Austin EMAIL Wilson S. Geisler Department of Psychology The University of Texas at Austin EMAIL Xue-Xin Wei Department of Neuroscience The University of Texas at Austin EMAIL
Pseudocode	No	The paper describes the DVC method and its implementation steps in text and diagrams (Figure 1 and A.1), but it does not present any formal pseudocode or algorithm blocks.
Open Source Code	Yes	The code is available at https://github.com/wei-bbc-lab/DVC.
Open Datasets	Yes	We used the publicly available dataset of objects rendered on naturalistic scenes [38]. We study a set of models (n=43, obtained from Torchvision, [39]) pretrained on ImageNet-1k, an influential benchmark in image classification.
Dataset Splits	Yes	We used 5-fold cross-validated logistic regression to obtain model decisions as well as monkey decisions.
Hardware Specification	Yes	All experiments were performed on Intel(R) Core i7-14700K CPU without resorting to GPU usage.
Software Dependencies	No	The paper mentions software like Torchvision and Timm (PyTorch Image Models) but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	Yes	To address this problem, we use dimensionality reduction (e.g., PCA) to reduce the representations to the same number of features before using LDA to decode the underlying DV (footnote 1: 125 PC dimensions). See Appendix C.3 and C.4 for experiments that demonstrate the robustness of the results. We measured the similarity between decoded DVs using Pearson Correlation. We used 5-fold cross-validated logistic regression to obtain model decisions as well as monkey decisions.
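
The Experiment Setup row names two generic steps that are easy to illustrate in isolation: a 5-fold partition of the images and a Pearson correlation between the decision variables (DVs) decoded for two observers. The following is a minimal sketch of only those two steps, not the authors' implementation (which is in the linked repository). It omits PCA and LDA entirely, and the function names `pearson` and `kfold_indices`, the simulated DVs, and the noise levels are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def kfold_indices(n_items, n_folds=5, seed=0):
    """Shuffle item indices and deal them into n_folds disjoint held-out folds."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    return [order[i::n_folds] for i in range(n_folds)]


# Hypothetical data: one DV per image for each of two observers.
# Each DV is a shared per-image component plus observer-specific noise,
# so the two observers agree partly, but not perfectly, image by image.
rng = random.Random(1)
n_images = 200
shared = [rng.gauss(0.0, 1.0) for _ in range(n_images)]
dv_a = [s + rng.gauss(0.0, 0.5) for s in shared]
dv_b = [s + rng.gauss(0.0, 0.5) for s in shared]

# In a cross-validated pipeline, each image's DV would come from a decoder
# fit on the other four folds; here the folds only show the partitioning.
folds = kfold_indices(n_images, n_folds=5)
per_fold = [
    pearson([dv_a[i] for i in fold], [dv_b[i] for i in fold]) for fold in folds
]

print("pooled correlation:", round(pearson(dv_a, dv_b), 3))
print("per-fold correlations:", [round(r, 3) for r in per_fold])
```

This sketch pools all images into one correlation. Whether the paper computes the correlation within each class or across classes is not stated in the excerpts above, so that choice should be checked against the paper or the repository before reusing the code.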