Influence Patterns for Explaining Information Flow in BERT
Authors: Kaiji Lu, Zifan Wang, Piotr Mardziel, Anupam Datta
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an extensive empirical study of influence patterns for several NLP tasks: Subject-Verb Agreement (SVA), Reflexive Anaphora (RA), and Sentiment Analysis (SA). Our findings are summarized below. |
| Researcher Affiliation | Academia | Kaiji Lu, Zifan Wang, Piotr Mardziel, Anupam Datta. Electrical and Computer Engineering, Carnegie Mellon University, Mountain View, CA 94089 |
| Pseudocode | Yes | The detailed algorithm of GPR and analysis of its optimality can be found in Appendix B.1 and B.2. |
| Open Source Code | No | we will explore these limitations in future work and release our code and hope the proposed methods will serve as an insightful tool in future exploration. |
| Open Datasets | Yes | We consider two groups of NLP tasks: (1) subject-verb agreement (SVA) and reflexive anaphora (RA)... (2) sentiment analysis (SA): we use 220 short examples (sentence length 17) from the evaluation set of the 2-class GLUE SST-2 sentiment analysis dataset [47]. |
| Dataset Splits | Yes | For SST-2 we fine-tuned on the pretrained BERT-Base [7] with L = 12, A = 12. We sample 1000 sentences from each subtask evenly distributed across different sentence types (e.g. singular/plural subject & singular/plural intervening noun) with a fixed sentence structure |
| Hardware Specification | Yes | All computations are done with a Titan V on a machine with 64 GB of RAM. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used for this work. |
| Software Dependencies | No | The paper mentions using BERT models and references TensorFlow, but does not provide specific version numbers for any software or libraries used in their experiments. |
| Experiment Setup | Yes | Let the target node for SVA and RA tasks be the output of the QoI score q(y) := y_correct − y_wrong. For instance, y_is − y_are for the sentence "she [MASK] happy." Similarly, we use y_positive − y_negative for sentiment analysis. We choose a uniform distribution over a linear path from x_b to x as the distribution D in Def. 3, where x_b is chosen as the input embedding of [MASK] because it can be viewed as a word with no information. For a given input token x_i, we apply GPR differently depending on the sign of the distributional influence g(x; q, D): if g(x; q, D) ≥ 0, we maximize the pattern influence towards q(y) at each iteration of the GPR; otherwise we maximize pattern influence towards −q(y). |
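The experiment-setup excerpt above combines a contrastive QoI score with a distributional influence averaged over a linear path from a "no information" baseline to the input. Below is a minimal sketch (not the authors' released code) of that computation. The two-logit `toy_model`, the zero baseline, and all function names are hypothetical stand-ins; in the paper the model is BERT, the baseline x_b is the [MASK] embedding, and q(y) = y_correct − y_wrong.

```python
import numpy as np

def qoi(logits, correct=0, wrong=1):
    """Contrastive QoI score q(y) = y_correct - y_wrong."""
    return logits[correct] - logits[wrong]

def toy_model(x, W):
    """Hypothetical stand-in for BERT: a linear map from embedding to two logits."""
    return W @ x

def grad_qoi(x, W, correct=0, wrong=1):
    """Gradient of q w.r.t. x. For the linear toy model this is constant:
    the difference of the two logit rows of W."""
    return W[correct] - W[wrong]

def distributional_influence(x, x_b, W, steps=50):
    """g(x; q, D): average gradient of q over a uniform distribution on the
    linear path from the baseline x_b to the input x (cf. Def. 3)."""
    alphas = np.linspace(0.0, 1.0, steps)
    grads = [grad_qoi(x_b + a * (x - x_b), W) for a in alphas]
    return np.mean(grads, axis=0)

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))   # 2 logits (correct/wrong), 4-dim toy embedding
x = rng.normal(size=4)        # input token embedding
x_b = np.zeros(4)             # baseline embedding ("word with no information")
g = distributional_influence(x, x_b, W)
# The sign of the influence decides the GPR direction: towards q(y) when the
# influence is non-negative, towards -q(y) otherwise.
print(g)
```

For the linear toy model the path average is exact, so `g` equals the single gradient `W[0] - W[1]`; with a real network the averaging over interpolation points is what distinguishes the distributional influence from a plain gradient at x.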