Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
CNN Interpretability with Multivector Tucker Saliency Maps for Self-Supervised Models
Authors: Aymene Mohammed Bouayed, Samuel Deslauriers-Gauthier, Adrian Iacovelli, David Naccache
TMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Quantitative evaluations on supervised classification models demonstrate that TSM, Multivec-Eigen CAM, and MTSM achieve competitive performance with label-dependent methods. Moreover, TSM enhances interpretability by approximately 50% over Eigen CAM for both supervised and self-supervised models. Multivec-Eigen CAM and MTSM further advance state-of-the-art interpretability performance on self-supervised models, with MTSM achieving the best results. |
| Researcher Affiliation | Collaboration | Aymene Mohammed Bouayed EMAIL DIÉNS, ÉNS, CNRS, PSL University, Paris, France; Be-Ys Research, France. Samuel Deslauriers-Gauthier EMAIL Centre Inria d'Université Côte d'Azur, Nice, France. Adrian Iacovelli EMAIL Be-Ys Research, France. David Naccache EMAIL DIÉNS, ÉNS, CNRS, PSL University, Paris, France |
| Pseudocode | Yes | Supplementary Material: Hereafter, we present the implementation of the TSM, Multivec-Eigen CAM and MTSM methods proposed in this work using the PyTorch library for Class Activation Map methods, as referenced in Jacob Gildenblat et al. (2021). import torch import numpy as np from pytorch_grad_cam.base_cam import BaseCAM from tensorly.decomposition import tucker class TSM(BaseCAM): def __init__(self, model, target_layers, use_cuda=False, reshape_transform=None): super(TSM, self).__init__(model, target_layers, use_cuda, reshape_transform, uses_gradients=False) def get_cam_image(self, input_tensor, target_layer, target_category, activation_batch, grads, eigen_smooth): feature_maps = torch.from_numpy(activation_batch) VT = [tucker(feature_map.numpy(), rank=list(feature_map.shape))[1][0] for feature_map in feature_maps] VT = torch.from_numpy(np.array(VT)) VT = VT[:, 0].unsqueeze(-1).unsqueeze(-1) S = (feature_maps * VT).sum(dim=1) return S.abs().numpy() |
| Open Source Code | Yes | Supplementary Material: Hereafter, we present the implementation of the TSM, Multivec-Eigen CAM and MTSM methods proposed in this work using the PyTorch library for Class Activation Map methods, as referenced in Jacob Gildenblat et al. (2021). |
| Open Datasets | Yes | We evaluate the different saliency map calculation methods in this work on the ImageNet 2012 (Deng et al., 2009) and Pascal VOC (Everingham et al., 2010) datasets. |
| Dataset Splits | Yes | ImageNet 2012: For this dataset, we use the 50,000 validation images of the ImageNet ILSVRC 2012 dataset (Deng et al., 2009). We mainly use this dataset to calculate the Average Drop, Average Increase and Mean Squared Error metrics. Pascal VOC: Owing to the availability of the segmentation masks on the Pascal VOC 2012 challenge dataset (Everingham et al., 2010), we harness this dataset to calculate the mean Intersection over Union metric (Jaccard, 1901). |
| Hardware Specification | No | This work received access to the High-Performance Computing (HPC) resources of MesoPSL, financed by the Region Île-de-France and the Equip@Meso project (reference ANR-10-EQPX-29-01) of the Investissements d'avenir program supervised by the Agence nationale pour la recherche. |
| Software Dependencies | No | The implementation of the different CAM-based saliency map methods is done via the PyTorch library for CAM methods (Jacob Gildenblat et al., 2021), the DeepLIFT and DeepSHAP methods are implemented using the Captum Python library (Kokhlikyan et al., 2020) and the LIME method is imported from the official implementation in (Ribeiro et al., 2016). |
| Experiment Setup | Yes | B.1 Selected layers per model for the CAM inference: In Table 4, we outline the chosen target layers for each examined model. These layers serve as the source from which we extract the feature map tensor for computing the saliency map. Table 4: Designated layers for saliency map calculation across all tested models and CAM methods in this paper. Model / Layer: ResNet50 (He et al., 2016) model.layer4[-1]; ConvNext (Liu et al., 2022) model.features; VGG16 (Simonyan & Zisserman, 2014) model.features; MoCo v2 (Chen et al., 2020b) model.layer4[-1]; SwAV (Caron et al., 2020) model.layer4[-1]; Barlow Twins (Zbontar et al., 2021) model.layer4[-1]; VICRegL ResNet50 (Bardes et al., 2022) model.layer4[-1]; VICRegL ConvNext (Bardes et al., 2022) model.stages[3][2].dwconv |
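The TSM excerpt quoted in the Pseudocode row weights each channel of a feature map by the first column of the mode-0 Tucker factor, then takes the absolute value of the weighted channel sum. As a minimal, dependency-free sketch of that channel-weighting idea, the snippet below approximates the mode-0 factor via HOSVD, i.e. the SVD of the channel-mode unfolding, rather than calling `tensorly.decomposition.tucker` as the paper's code does; the function name and shapes are illustrative assumptions, not the authors' API.

```python
import numpy as np

def tsm_saliency(feature_map):
    """Sketch of a TSM-style saliency map for one feature map of shape (C, H, W).

    The leading left singular vector of the mode-0 (channel) unfolding plays
    the role of the first column of the Tucker mode-0 factor: it weights each
    channel before summing, and the absolute value gives the saliency map.
    """
    C, H, W = feature_map.shape
    unfolded = feature_map.reshape(C, H * W)          # mode-0 unfolding
    U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    weights = U[:, 0]                                 # leading channel direction
    # Weighted sum over channels, then absolute value (as in the quoted code).
    return np.abs((weights[:, None, None] * feature_map).sum(axis=0))

rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 7, 7))                 # toy feature map
sal = tsm_saliency(fmap)
print(sal.shape)  # (7, 7): one saliency value per spatial location
```

Note the output is non-negative by construction and has the spatial resolution of the chosen target layer; in the paper's pipeline it would then be upsampled to the input image size by the CAM framework.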