AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers

Authors: Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer, Aakriti Jain, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive evaluations against existing methods on LLaMA 2, Mixtral 8x7b, Flan-T5 and vision transformer architectures, we demonstrate that our proposed approach surpasses alternative methods in terms of faithfulness and enables the understanding of latent representations, opening up the door for concept-based explanations. We provide an LRP library at https://github.com/rachtibat/LRP-eXplains-Transformers.
Researcher Affiliation | Collaboration | 1 Fraunhofer Heinrich-Hertz-Institute, 10587 Berlin, Germany; 2 Technische Universität Berlin, 10587 Berlin, Germany; 3 BIFOLD Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany.
Pseudocode | No | The paper describes its methodology and mathematical derivations in prose and equations, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | We provide an LRP library at https://github.com/rachtibat/LRP-eXplains-Transformers.
Open Datasets | Yes | In order to assess plausibility, we utilize the SQuAD v2 Question-Answering (QA) dataset (Rajpurkar et al., 2018), which includes a ground truth mask indicating the correct answer within the question. ... For Wikipedia and IMDB faithfulness, we evaluated the pretrained LLaMA 2-7b hosted on Hugging Face (Wolf et al., 2019) on 4000 randomly selected validation dataset samples (fixed set for all baselines). ... For ImageNet faithfulness, we utilized the pretrained Vision Transformer B-16, L-16 and L-32 weights of the PyTorch model zoo (Paszke et al., 2019). [loading sketch below the table]
Dataset Splits | Yes | For Wikipedia and IMDB faithfulness, we evaluated the pretrained LLaMA 2-7b hosted on Hugging Face (Wolf et al., 2019) on 4000 randomly selected validation dataset samples (fixed set for all baselines). ... For IMDB, we added a last linear layer to a frozen LLaMA 2-7b model and finetuned only the last layer, which achieves 93% accuracy on the validation dataset. [linear-probe sketch below the table]
Hardware Specification | Yes | We benchmark the runtime and peak GPU memory consumption for computing a single attribution for LLaMA 2 with batch size 1 on a node with four A100-SXM4 40GB GPUs, 512 GB CPU RAM and 32 AMD EPYC 73F3 3.5 GHz cores. [benchmark sketch below the table]
Software Dependencies | No | The paper mentions software like PyTorch, Hugging Face, zennit, and bitsandbytes with citations, but does not provide specific version numbers for these libraries (e.g., PyTorch 1.x or 2.x, or a specific release of the Hugging Face Transformers library). For example: "We utilize zennit (Anders et al., 2021) and its default settings to compute Integrated Gradients attribution maps..." and "...using bitsandbytes (Dettmers et al., 2024)." [version-logging sketch below the table]
Experiment Setup | Yes | For SmoothGrad, we set µ = 0 and perform a hyperparameter search for σ to find the optimal parameter. We utilize zennit (Anders et al., 2021) and its default settings to compute SmoothGrad attribution maps, i.e. m = 20. ... Regarding SQuAD v2, we set AtMan's p = 0.7 for Mixtral 8x7b and p = 0.9 for Flan-T5-XL. For SmoothGrad, we set σ = 0.1 for Mixtral 8x7b and Flan-T5-XL. ... Table B.5 (proposed composite for the AttnLRP and CP-LRP methods used for the Vision Transformer): Convolution → Gamma(γ = 0.25); Linear → Gamma(γ = 0.05); Linear Input Projection → Epsilon; Linear Output Projection → Epsilon. [zennit sketch below the table]
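
The Open Datasets row refers to publicly available pretrained checkpoints rather than custom models. A minimal loading sketch is given below; the Hugging Face checkpoint identifier (meta-llama/Llama-2-7b-hf) is an assumption and not taken from the paper.

```python
# Sketch only: loading the pretrained backbones named in the Open Datasets row.
# The Hugging Face checkpoint name is an assumption; LLaMA 2 weights are gated.
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights
from transformers import AutoModelForCausalLM, AutoTokenizer

# ImageNet-pretrained ViT-B/16 from the PyTorch model zoo
vit = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1).eval()

# LLaMA 2-7b hosted on Hugging Face
model_id = "meta-llama/Llama-2-7b-hf"  # assumed identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
llama = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
```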
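The Dataset Splits row mentions finetuning only a last linear layer on top of a frozen LLaMA 2-7b for IMDB. A minimal sketch of such a linear probe follows, assuming a standard Hugging Face backbone and a binary sentiment head; this is not the authors' training code.

```python
# Sketch only: freeze the backbone, train just a final linear head.
import torch
import torch.nn as nn
from transformers import AutoModel

class FrozenBackboneClassifier(nn.Module):
    def __init__(self, backbone_id="meta-llama/Llama-2-7b-hf", num_classes=2):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(backbone_id, torch_dtype=torch.float16)
        for p in self.backbone.parameters():
            p.requires_grad = False                      # keep LLaMA 2 frozen
        self.head = nn.Linear(self.backbone.config.hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        hidden = self.backbone(input_ids=input_ids,
                               attention_mask=attention_mask).last_hidden_state
        return self.head(hidden[:, -1, :].float())       # classify from last token state

model = FrozenBackboneClassifier()
optimizer = torch.optim.AdamW(model.head.parameters(), lr=1e-4)  # head-only updates
```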
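The Hardware Specification row describes measuring runtime and peak GPU memory per attribution. A generic measurement helper in PyTorch could look like this; it is an assumption, not the authors' benchmarking script.

```python
# Sketch only: time a single attribution call and record peak GPU memory.
import time
import torch

def benchmark(attribution_fn, device="cuda"):
    torch.cuda.reset_peak_memory_stats(device)
    torch.cuda.synchronize(device)
    start = time.perf_counter()
    result = attribution_fn()                   # e.g. one attribution, batch size 1
    torch.cuda.synchronize(device)
    runtime_s = time.perf_counter() - start
    peak_gib = torch.cuda.max_memory_allocated(device) / 1024**3
    return result, runtime_s, peak_gib
```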
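Since the Software Dependencies row flags missing version numbers, a small snippet like the following could record the exact versions used when rerunning the experiments; the package list is an assumption based on the libraries named above.

```python
# Sketch only: log installed versions of the libraries cited in the paper.
import importlib.metadata as md

for pkg in ("torch", "transformers", "zennit", "bitsandbytes"):
    try:
        print(f"{pkg}=={md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed")
```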
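The Experiment Setup row quotes zennit's default SmoothGrad settings and the rule assignment of Table B.5. The sketch below shows how such a configuration might look in zennit; it is an assumption rather than the authors' code, only the type-based part of the composite is expressed (the projection-specific Epsilon rules would need a name-based mapping), and the attention-aware AttnLRP rules themselves are provided by the authors' LRP-eXplains-Transformers library.

```python
# Sketch only: SmoothGrad baseline (m = 20 samples) and a partial Table B.5
# composite with zennit, illustrated on a torchvision ViT-B/16.
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights
from zennit.attribution import Gradient, SmoothGrad
from zennit.composites import LayerMapComposite
from zennit.rules import Gamma

model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1).eval()
x = torch.randn(1, 3, 224, 224, requires_grad=True)   # placeholder input
target = torch.eye(1000)[[0]]                          # one-hot output selection

# SmoothGrad with 20 noise samples; noise_level stands in for sigma here
# (zennit scales the noise by the input value range).
with SmoothGrad(model, noise_level=0.1, n_iter=20) as attributor:
    _, sg_attr = attributor(x, target)

# Type-based part of Table B.5: Gamma for convolutions and linear layers.
composite = LayerMapComposite(layer_map=[
    (torch.nn.Conv2d, Gamma(gamma=0.25)),
    (torch.nn.Linear, Gamma(gamma=0.05)),
])
with Gradient(model, composite) as attributor:
    _, lrp_attr = attributor(x, target)
```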