Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
[Re] Improving Interpretation Faithfulness for Vision Transformers
Authors: Izabela Kurek, Wojciech Trejter, Stipe Frković, Andro Erdelez
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This work aims to reproduce the results of Faithful Vision Transformers (FViTs) proposed by Hu et al. (2024) alongside interpretability methods for Vision Transformers from Chefer et al. (2021) and Xu et al. (2022). We investigate claims made by Hu et al. (2024), namely that the usage of Diffusion Denoised Smoothing (DDS) improves interpretability robustness to (1) attacks in a segmentation task and (2) perturbation and attacks in a classification task. Our results broadly agree with the original study's findings, although minor discrepancies were found and discussed. |
| Researcher Affiliation | Academia | Izabela Kurek EMAIL University of Amsterdam Wojciech Trejter EMAIL University of Amsterdam Stipe Frković EMAIL University of Amsterdam Andro Erdelez EMAIL University of Amsterdam |
| Pseudocode | No | In their paper, Hu et al. (2024) present two algorithms based on the previously presented mathematical framework, allowing for finding FViTs computationally. Taking into account the aforementioned issues, we cannot guarantee the correctness of Algorithm 2's ability to find the faithfulness region of an FViT, as it is based on Conjecture B.2. Additionally, Algorithm 1 of Hu et al. (2024) is an adaptation of Algorithm 1 put forward by Carlini et al. (2023) for guaranteeing adversarial robustness of any classifier. |
| Open Source Code | Yes | Our code is available at github.com/aerdelez/re-fvit. |
| Open Datasets | Yes | The ImageNet-segmentation subset was used for the image segmentation task (Guillaumin et al., 2014). The subset contains images sourced from the full ImageNet dataset with ground-truth segmentations. The ImageNet LSVRC 2012 Validation Set (Russakovsky et al., 2015) was used for the classification under perturbation and attack task. |
| Dataset Splits | No | The ImageNet-segmentation subset was used for the image segmentation task (Guillaumin et al., 2014). ... Therefore, with a model pre-trained for ImageNet classification, there is no need for fine-tuning or data splitting. ... The ImageNet LSVRC 2012 Validation Set (Russakovsky et al., 2015) was used for the classification under perturbation and attack task. ... A total of 4000 images were sampled from the validation dataset. |
| Hardware Specification | Yes | The experiments were run on an NVIDIA A100 GPU partitioned into two instances using Multi-Instance GPU technology, effectively utilizing half of the GPU; in addition, 9 cores of Intel Xeon CPUs and 60GB of RAM were utilized. |
| Software Dependencies | No | The code used for the experiments was built on top of Hu et al. (2024), which extended the code from Chefer et al. (2021) by providing the qualitative demo. We further extended the code of Chefer et al. (2021) by adding PGD and DDS to both tasks based on the demo implementation by Hu et al. (2024). We also added the AR interpretability method to the segmentation task following the implementation by Xu et al. (2022). |
| Experiment Setup | Yes | No hyperparameter search was performed for the experiments; pre-trained ViT and diffusion models were used. Unless specified otherwise, the replicated experiments use the default hyperparameters proposed for DDS, which are a noise level of 8/255 and 45 backward steps for denoising. Following the demo implementation of Hu et al. (2024), the number of samples created was 10 for the qualitative visualizations before adversarial attack and 2 in all other cases. A PGD attack was used in both tasks; the parameters were left to their default implementation values, as used by Hu et al. (2024). Specifically, unless specified otherwise, the maximum perturbation was set to 8/255, the step size to 2/255, and the number of steps to 10. A random seed of 44 was used for all experiments. |
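The PGD settings quoted in the Experiment Setup row (maximum L-infinity perturbation 8/255, step size 2/255, 10 steps, seed 44) can be sketched as follows. This is a minimal illustration only: it attacks a hypothetical toy linear scorer with an analytic gradient rather than the pre-trained ViTs and DDS pipeline used in the actual reproduction, so the model, loss, and `pgd_attack` helper are all assumptions for demonstration.

```python
import numpy as np

# Hyperparameters quoted from the report (defaults of Hu et al., 2024).
EPS = 8 / 255      # maximum L-inf perturbation
ALPHA = 2 / 255    # PGD step size
STEPS = 10         # number of PGD steps

rng = np.random.default_rng(44)   # seed 44, as in the report
w = rng.normal(size=16)           # toy linear scorer (hypothetical stand-in for a ViT)


def pgd_attack(x, w, eps=EPS, alpha=ALPHA, steps=STEPS):
    """L-inf PGD: repeatedly step along the loss gradient sign, then
    project back into the eps-ball around x and the valid pixel range."""
    x_adv = x.copy()
    for _ in range(steps):
        # For the toy loss  L = w . x_adv, the gradient w.r.t. x_adv is w;
        # a real attack would backpropagate through the network instead.
        grad = w
        x_adv = x_adv + alpha * np.sign(grad)     # gradient-sign ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in [0, 1] pixel range
    return x_adv


x = rng.uniform(0.2, 0.8, size=16)  # toy "image" away from the range boundary
x_adv = pgd_attack(x, w)
```

After the attack, `x_adv` differs from `x` by at most 8/255 per coordinate, matching the L-infinity budget stated above; swapping the analytic `grad = w` line for autograd on a real model recovers the standard attack.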