Measuring Per-Unit Interpretability at Scale Without Humans

Authors: Roland S. Zimmermann, David Klindt, Wieland Brendel

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate its predictive power through an interventional human psychophysics study. We demonstrate the usefulness of this measure by performing previously infeasible experiments: (1) A large-scale interpretability analysis across more than 70 million units from 835 computer vision models, and (2) an extensive analysis of how units transform during training.
Researcher Affiliation | Academia | Roland S. Zimmermann (MPI-IS, Tübingen AI Center); David Klindt (Stanford); Wieland Brendel (MPI-IS, Tübingen AI Center)
Pseudocode | No | The paper provides mathematical equations for the Machine Interpretability Score (MIS) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. (See the sketch after this table.)
Open Source Code | Yes | Online version, code and interactive visualizations available at brendel-group.github.io/mis.
Open Datasets | Yes | Both query images and explanations are chosen from the training set of ImageNet-2012 [40].
Dataset Splits | No | The paper mentions using the 'training set of ImageNet-2012' for query images and explanations, and refers to a 'training recipe' for a ResNet-50, but it does not explicitly state the percentages or counts of training, validation, and test splits used in its experiments.
Hardware Specification | Yes | Evaluating all units of a model takes, on average and varying depending on the model’s size, less than one hour on a GPU (e.g., NVIDIA RTX 2080-TI or V100).
Software Dependencies | No | The paper mentions using 'DreamSim' and a 'training recipe' but does not specify version numbers for key software components or libraries such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | To choose α, we use the interpretability annotations of IMI [50]: We optimize α over a randomly chosen subset of just 5% of the annotated units to approximately match the value range of human interpretability scores, resulting in α = 0.16. ... As they used up to 20 tasks per unit, we average over N = 20. ... For this, we train a ResNet-50 on ImageNet-2012, following the training recipe A3 of Wightman et al. [45], for 100 epochs. (See the calibration sketch after this table.)
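Since the paper provides equations but no pseudocode for the MIS (see the Pseudocode row), the following is a minimal sketch of how the quoted pieces fit together. It assumes the MIS averages a logistic readout of two-alternative-forced-choice (2AFC) decision margins over the N = 20 tasks per unit, with the temperature α = 0.16 from the Experiment Setup row; the function name mis(), the margin formulation, and the exact sigmoid form are illustrative assumptions, not the authors' released implementation (see brendel-group.github.io/mis for that).

    import numpy as np

    def mis(per_task_evidence, alpha=0.16):
        """Machine Interpretability Score for one unit (illustrative sketch).

        per_task_evidence: length-N array of 2AFC decision margins, one per
        task (the paper averages over N = 20 tasks per unit). Each margin is
        assumed to come from a perceptual similarity model such as DreamSim,
        e.g. how much more similar the correct query image is to the positive
        explanation images than the foil query is.
        alpha: temperature of the logistic readout; the paper calibrates it
        on 5% of the IMI annotations, obtaining alpha = 0.16.
        """
        evidence = np.asarray(per_task_evidence, dtype=float)
        soft_accuracy = 1.0 / (1.0 + np.exp(-evidence / alpha))  # per-task soft 2AFC accuracy
        return float(soft_accuracy.mean())  # average over the N tasks

    # Example: 20 synthetic task margins for a fairly interpretable unit.
    rng = np.random.default_rng(0)
    margins = rng.normal(loc=0.2, scale=0.1, size=20)
    print(round(mis(margins), 3))  # consistently positive margins give an MIS near 1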
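The Experiment Setup row quotes how α is chosen but not the objective used. A hedged sketch of that calibration, assuming a plain grid search with a squared-error criterion against the human scores (the paper only says α is optimized to approximately match their value range), could look like this; it reuses mis() from the sketch above, and calibrate_alpha is a hypothetical helper name.

    import numpy as np

    def calibrate_alpha(evidence_per_unit, human_scores,
                        alphas=np.linspace(0.01, 1.0, 100)):
        """Grid-search the temperature alpha on annotated units (sketch).

        evidence_per_unit: one per-task evidence array per annotated unit
        (the paper uses a random 5% subset of the IMI-annotated units).
        human_scores: human interpretability score per unit, same order.
        """
        best_alpha, best_err = None, np.inf
        for a in alphas:
            preds = np.array([mis(e, alpha=a) for e in evidence_per_unit])
            err = float(np.mean((preds - np.asarray(human_scores)) ** 2))  # assumed criterion
            if err < best_err:
                best_alpha, best_err = a, err
        return best_alpha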