Self-Attention Attribution: Interpreting Information Interactions Inside Transformer
Authors: Yaru Hao, Li Dong, Furu Wei, Ke Xu (pp. 12963-12971)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We take BERT as an example to conduct extensive studies. For example, on the MNLI dataset, adding one adversarial pattern into the premise can drop the accuracy of entailment from 82.87% to 0.8%. |
| Researcher Affiliation | Collaboration | 1 Beihang University 2 Microsoft Research {haoyaru@,kexu@nlsde.}buaa.edu.cn {lidong1,fuwei}@microsoft.com |
| Pseudocode | Yes | Algorithm 1 Attribution Tree Construction |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the proposed ATTATTR method, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We perform BERT fine-tuning and conduct experiments on four classification datasets. MNLI (Williams, Nangia, and Bowman 2018)... RTE (Dagan, Glickman, and Magnini 2006; Bar-Haim et al. 2006; Giampiccolo et al. 2007; Bentivogli et al. 2009)... SST-2 (Socher et al. 2013)... MRPC (Dolan and Brockett 2005)... |
| Dataset Splits | Yes | We use the same data split as in (Wang et al. 2019). We calculate I_h on 200 examples sampled from the held-out dataset. |
| Hardware Specification | Yes | For a sequence of 128 tokens, the attribution time of the BERT-base model takes about one second on an Nvidia-v100 GPU card. |
| Software Dependencies | No | The paper mentions using 'BERT-base-cased' and fine-tuning settings suggested in 'Devlin et al. (2019)', but does not provide specific software version numbers for libraries or environments like Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | When fine-tuning BERT, we follow the settings and the hyper-parameters suggested in (Devlin et al. 2019). In our experiments, we set m to 20, which performs well in practice. We set τ = 0.4 for layers l < 12. ... we set τ to 0 for the last layer. |
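
For context on the Experiment Setup row: ATTATTR scores each attention head by taking the element-wise product of the attention matrix with an integrated-gradients term, approximated by an m-step Riemann sum (the paper sets m = 20). Below is a minimal sketch of that computation; the `model_forward` callable, the tensor shapes, and the function name are assumptions for illustration, not code from the paper.

```python
import torch

def attention_attribution(model_forward, attention, m=20):
    """Approximate ATTATTR-style attribution scores for one layer.

    model_forward: callable mapping a (heads, seq, seq) attention tensor to a
                   scalar model output (e.g. the gold-label logit). This
                   interface is an assumption for illustration.
    attention:     attention scores A of the layer, shape (heads, seq, seq).
    m:             number of Riemann approximation steps (20 in the paper).
    """
    grad_sum = torch.zeros_like(attention)
    for k in range(1, m + 1):
        # Scale the attention matrix along the integration path: (k/m) * A.
        scaled = (k / m) * attention.detach()
        scaled.requires_grad_(True)
        output = model_forward(scaled)                 # F((k/m) * A), a scalar
        grad_sum += torch.autograd.grad(output, scaled)[0]
    # Element-wise product of A with the averaged gradients along the path.
    return attention.detach() * grad_sum / m
```

Attribution scores above a relative threshold τ per layer (0.4 for layers below the last, 0 for the last layer, per the quoted setup) are then used to build the attribution tree of Algorithm 1.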