Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Reproducibility Study of "Improving Interpretation Faithfulness For Vision Transformers"

Authors: Meher Changlani, Benjamin Hucko, Ioannis Kechagias, Aswin Krishna Mahadevan

TMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental This paper attempts to reproduce the findings of the study "Improving Interpretation Faithfulness For Vision Transformers" by Hu et al. (2024). The original authors focus on making vision transformers (ViTs) more robust to adversarial attacks, calling these robust ViTs faithful ViTs (FViTs). In their paper they propose denoised diffusion smoothing (DDS), a universal method for transforming ViTs into FViTs. The reproduction of the authors' study suffers from certain challenges, but the main claims still hold. Furthermore, this study extends the original paper by trying different diffusion models for DDS and by testing whether the increased robustness of FViTs generalizes.
Researcher Affiliation Academia Meher Changlani EMAIL University of Amsterdam Aswin Krishna Mahadevan EMAIL University of Amsterdam Benjamin Hucko EMAIL University of Amsterdam Ioannis Kechagias EMAIL University of Amsterdam
Pseudocode Yes Algorithm 1 Denoised Diffusion Smoothing (DDS); Algorithm 2 FViT; Algorithm 3 Diffusion Denoising; Algorithm 4 Optimal Step Selection (get_optimal_number_of_steps)
Open Source Code Yes To replicate the study, we utilized the public repository provided by the authors, which includes code for the baselines, post-hoc methods, models, and helper functions for loading datasets. Our adapted code can be found on GitHub.
Open Datasets Yes ILSVRC-2012 ImageNet. A large-scale dataset widely used for image classification and object recognition tasks. ImageNet-C (Hendrycks & Dietterich, 2019). This dataset consists of 15 diverse corruption types applied to the validation images of the ImageNet dataset. ImageNet-Segmentation. A subset of the ImageNet dataset (Guillaumin et al., 2014) that provides pixel-level object-background segmentations for 500,000 images spanning 577 object categories. COCO. The "Common Objects in Context" dataset (Lin et al., 2015) is a large-scale image recognition, segmentation, and captioning dataset. Cityscapes. The Cityscapes dataset (Cordts et al., 2016) focuses on semantic understanding of urban street scenes and includes 5,000 images with fine annotations and 20,000 images with coarse annotations, spanning 30 classes.
Dataset Splits Yes ILSVRC-2012 ImageNet consists of 1.28 million training images, 50,000 validation images, and 100,000 test images, spanning 1,000 object categories. We used 10,000 images each from the defocus blur subcategory within blur and the elastic transformation subcategory within digital.
Hardware Specification Yes For all our experiments we used a single GPU, an NVIDIA A100 80GB.
Software Dependencies No We observed discrepancies in the implementation that hindered our reproduction of the results. Despite these challenges, we were able to address several shortcomings wherever possible. For the ImageNet dataset, the authors used diffusion model weights provided by Ho et al. (2020), whereas for the COCO and Cityscapes datasets, the authors trained their own diffusion models but did not make these models publicly available. Even the provided environment was missing several packages or had compatibility issues between the versions of dependencies, making the initial setup for reproduction a tedious task.
Experiment Setup Yes Algorithm 1 Denoised Diffusion Smoothing (DDS). Input: image x, noise level δ, diffusion steps N, schedule bounds β1, β2. Flags: Denoising (default: True), Smoothing (default: True). Output: a denoised and smoothed image x̂. t ← get_optimal_number_of_steps(δ, β1, β2, N) (optimal step selection). Experiment 3 (Claims 3, 4): Conduct positive and negative perturbation tests for all ViTs across all baselines under default attack positions while varying the attack radius from 0 to 32/255. ... For each baseline attribution method, compare the AUC (area under the curve) of the accuracy vs. pixel perturbation percentage (ranging from 0-90%) plot for every attack radius. Table 2: Top-1 classification accuracy on the ILSVRC-2012 validation dataset with default attack. Table 4: Top-1 accuracy on the ILSVRC-2012 validation dataset with attack.
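The DDS procedure described above (add Gaussian smoothing noise of scale δ, then denoise with a diffusion model run for an optimally chosen number of steps) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the step-selection criterion (matching δ² to the effective forward-process variance of a linear DDPM beta schedule, as in Ho et al., 2020) and the `denoiser` callable are our assumptions.

```python
import numpy as np

def get_optimal_number_of_steps(delta, beta1, beta2, n_steps):
    # Hypothetical reconstruction of Algorithm 4: with a linear beta
    # schedule (Ho et al., 2020), pick the timestep whose effective
    # forward-process noise variance (1 - alpha_bar_t) / alpha_bar_t
    # is closest to the smoothing variance delta^2. This matching
    # criterion is our assumption, not taken from the paper.
    betas = np.linspace(beta1, beta2, n_steps)
    alpha_bars = np.cumprod(1.0 - betas)
    effective_var = (1.0 - alpha_bars) / alpha_bars
    return int(np.argmin(np.abs(effective_var - delta ** 2)))

def denoised_diffusion_smoothing(x, delta, n_steps, beta1, beta2, denoiser,
                                 denoising=True, smoothing=True, rng=None):
    # Sketch of Algorithm 1 (DDS): optionally perturb the input with
    # Gaussian noise of scale delta (smoothing), then optionally denoise
    # it with a diffusion model run for t steps. `denoiser` is a stand-in
    # callable (image, t) -> image; the paper uses a pretrained DDPM,
    # which we do not reproduce here.
    if rng is None:
        rng = np.random.default_rng(0)
    t = get_optimal_number_of_steps(delta, beta1, beta2, n_steps)
    x_hat = x + delta * rng.standard_normal(x.shape) if smoothing else x
    if denoising:
        x_hat = denoiser(x_hat, t)
    return x_hat
```

A classifier or attribution method would then be evaluated on `x_hat` rather than `x`, which is what makes the resulting ViT "faithful" in the paper's terminology.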