AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
Authors: Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer, Aakriti Jain, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive evaluations against existing methods on LLaMa 2, Mixtral 8x7b, Flan-T5 and vision transformer architectures, we demonstrate that our proposed approach surpasses alternative methods in terms of faithfulness and enables the understanding of latent representations, opening up the door for concept-based explanations. We provide an LRP library at https://github.com/rachtibat/LRP-eXplains-Transformers. |
| Researcher Affiliation | Collaboration | ¹Fraunhofer Heinrich-Hertz-Institute, 10587 Berlin, Germany; ²Technische Universität Berlin, 10587 Berlin, Germany; ³BIFOLD Berlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany. |
| Pseudocode | No | The paper describes its methodology and mathematical derivations in prose and equations, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We provide an LRP library at https://github.com/rachtibat/LRP-eXplains-Transformers. |
| Open Datasets | Yes | In order to assess plausibility, we utilize the SQuAD v2 Question-Answering (QA) dataset (Rajpurkar et al., 2018), which includes a ground truth mask indicating the correct answer within the question. ... For Wikipedia and IMDB faithfulness, we evaluated the pretrained LLaMa 2-7b hosted on Hugging Face (Wolf et al., 2019) on 4000 randomly selected validation dataset samples (fixed set for all baselines). ... For ImageNet faithfulness, we utilized the pretrained Vision Transformer B-16, L-16 and L-32 weights of the PyTorch model zoo (Paszke et al., 2019). |
| Dataset Splits | Yes | For Wikipedia and IMDB faithfulness, we evaluated the pretrained LLaMa 2-7b hosted on Hugging Face (Wolf et al., 2019) on 4000 randomly selected validation dataset samples (fixed set for all baselines). ... For IMDB, we added a last linear layer to a frozen LLaMa 2-7b model and finetuned only the last layer, which achieves 93% accuracy on the validation dataset. |
| Hardware Specification | Yes | We benchmark the runtime and peak GPU memory consumption for computing a single attribution for LLa Ma 2 with batch size 1 on a node with four A100-SXM4 40GB, 512 GB CPU RAM and 32 AMD EPYC 73F3 3.5 GHz. |
| Software Dependencies | No | The paper mentions software such as PyTorch, Hugging Face, zennit, and bitsandbytes with citations, but does not provide specific version numbers for these libraries (e.g., PyTorch 1.x or 2.x, or a specific version of the Hugging Face Transformers library). For example: "We utilize zennit (Anders et al., 2021) and its default settings to compute Integrated Gradients attribution maps..." and "...using bitsandbytes (Dettmers et al., 2024)." |
| Experiment Setup | Yes | For SmoothGrad, we set µ = 0 and perform a hyperparameter search for σ to find the optimal parameter. We utilize zennit (Anders et al., 2021) and its default settings to compute SmoothGrad attribution maps, i.e. m = 20. ... Regarding SQuAD v2, we set AtMan's p = 0.7 for Mixtral 8x7b and p = 0.9 for Flan-T5-XL. For SmoothGrad, we set σ = 0.1 for Mixtral 8x7b and Flan-T5-XL. ... Table B.5 (proposed composite for the AttnLRP and CP-LRP methods used for the Vision Transformer): Convolution → Gamma(γ = 0.25); Linear → Gamma(γ = 0.05); Linear Input Projection → Epsilon; Linear Output Projection → Epsilon. |
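The Table B.5 composite quoted above assigns LRP rules per layer type. Below is a minimal, illustrative sketch of how such a layer-map could be configured with zennit, the attribution library the paper uses for its gradient-based baselines. The class index, the torchvision ViT-B/16 weights, and the omission of the Epsilon rule for the attention input/output projections (which would require model-specific name matching, e.g. via zennit's NameMapComposite) are assumptions for illustration only; the authors' actual AttnLRP rules are implemented in their own LRP-eXplains-Transformers library, not in zennit.

```python
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights
from zennit.composites import LayerMapComposite
from zennit.rules import Gamma, Epsilon
from zennit.attribution import Gradient, SmoothGrad

# Sketch of the Table B.5 layer-map: Gamma(0.25) for convolutions,
# Gamma(0.05) for linear layers. The Epsilon rule for the attention
# input/output projections is omitted here, since matching those
# modules requires model-specific layer names.
composite = LayerMapComposite(layer_map=[
    (torch.nn.Conv2d, Gamma(gamma=0.25)),
    (torch.nn.Linear, Gamma(gamma=0.05)),
])

model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1).eval()
x = torch.randn(1, 3, 224, 224, requires_grad=True)  # stand-in for a preprocessed image
target = torch.eye(1000)[[0]]                         # one-hot for a placeholder class to explain

# LRP-style attribution: gradient of the chosen output under the composite's rules.
with Gradient(model=model, composite=composite) as attributor:
    output, relevance = attributor(x, target)
heatmap = relevance.sum(dim=1)                        # aggregate relevance over color channels

# The SmoothGrad baseline (sigma = 0.1, m = 20 samples) has a direct counterpart in
# zennit; note that zennit's noise_level is a relative noise scale, so its exact
# correspondence to the paper's sigma is an assumption.
with SmoothGrad(model=model, noise_level=0.1, n_iter=20) as attributor:
    output, sg_attribution = attributor(x, target)
```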