Influence Patterns for Explaining Information Flow in BERT

Authors: Kaiji Lu, Zifan Wang, Piotr Mardziel, Anupam Datta

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct an extensive empirical study of influence patterns for several NLP tasks: Subject-Verb Agreement (SVA), Reflexive Anaphora (RA), and Sentiment Analysis (SA). Our findings are summarized below."
Researcher Affiliation | Academia | "Kaiji Lu, Zifan Wang, Piotr Mardziel, Anupam Datta. Electrical and Computer Engineering, Carnegie Mellon University, Mountain View, CA 94089"
Pseudocode | Yes | "The detailed algorithm of GPR and analysis of its optimality can be found in Appendix B.1 and B.2."
Open Source Code | No | "we will explore these limitations in future work and release our code and hope the proposed methods will serve as an insightful tool in future exploration."
Open Datasets | Yes | "We consider two groups of NLP tasks: (1) subject-verb agreement (SVA) and reflexive anaphora (RA)... (2) sentiment analysis (SA): we use 220 short examples (sentence length 17) from the evaluation set of the 2-class GLUE SST-2 sentiment analysis dataset [47]."
Dataset Splits | Yes | "For SST-2 we fine-tuned on the pretrained BERT_BASE [7] with L = 12, A = 12. We sample 1000 sentences from each subtask, evenly distributed across different sentence types (e.g. singular/plural subject & singular/plural intervening noun), with a fixed sentence structure."
Hardware Specification | Yes | "All computations are done with a Titan V on a machine with 64 GB of RAM. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used for this work."
Software Dependencies | No | The paper mentions using BERT models and references TensorFlow, but does not provide specific version numbers for any software or libraries used in the experiments.
Experiment Setup | Yes | "Let the target node for SVA and RA tasks be the output of the QoI score q(y) := y_correct - y_wrong. For instance, y_is - y_are for the sentence 'she [MASK] happy'. Similarly, we use y_positive - y_negative for sentiment analysis. We choose a uniform distribution over a linear path from x_b to x as the distribution D in Def. 3, where x_b is chosen as the input embedding of [MASK] because it can be viewed as a word with no information. For a given input token x_i, we apply GPR differently depending on the sign of the distributional influence g(x; q, D): if g(x; q, D) >= 0, we maximize the pattern influence towards q(y) at each iteration of the GPR; otherwise we maximize the pattern influence towards -q(y)."
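
For concreteness, here is a minimal PyTorch sketch of the distributional influence g(x; q, D) described in the Experiment Setup row: the QoI is q(y) = y_correct - y_wrong, and D is uniform over the linear path from the [MASK] baseline embedding x_b to the input embedding x. This is an illustrative reconstruction under stated assumptions, not the authors' released code; `model_fn`, the token ids, and the step count are hypothetical, and averaging path gradients is one standard way to estimate the expectation under D.

```python
import torch

def distributional_influence(model_fn, x, x_b, correct_id, wrong_id, steps=32):
    """Expected gradient of the QoI q(y) = y_correct - y_wrong under a
    uniform distribution over the linear path from x_b to x.

    model_fn: hypothetical callable mapping an embedding tensor to the
              output logits at the [MASK] position.
    x, x_b:   input embedding and [MASK] baseline embedding (same shape).
    """
    grads = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        # Point on the linear path from the baseline to the input,
        # tracked for autograd.
        x_alpha = (x_b + alpha * (x - x_b)).detach().requires_grad_(True)
        logits = model_fn(x_alpha)
        q = logits[correct_id] - logits[wrong_id]  # QoI score q(y)
        q.backward()
        grads += x_alpha.grad
    return grads / steps  # Monte Carlo estimate of the expected gradient
```

Per the setup above, the sign of this influence would then pick the GPR objective for each input token: non-negative influence means refining patterns that maximize influence towards q(y), and negative influence towards -q(y).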