Compositional Explanations of Neurons
Authors: Jesse Mu, Jacob Andreas
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use this procedure to answer several questions on interpretability in models for vision and natural language processing. First, we examine the kinds of abstractions learned by neurons. In image classification, we find that many neurons learn highly abstract but semantically coherent visual concepts, while other polysemantic neurons detect multiple unrelated features; in natural language inference (NLI), neurons learn shallow lexical heuristics from dataset biases. Second, we see whether compositional explanations give us insight into model performance: vision neurons that detect human-interpretable concepts are positively correlated with task performance, while NLI neurons that fire for shallow heuristics are negatively correlated with task performance. |
| Researcher Affiliation | Academia | Jesse Mu (Stanford University, muj@stanford.edu); Jacob Andreas (MIT CSAIL, jda@mit.edu) |
| Pseudocode | No | The paper describes the explanation generation procedure verbally and visually in Figure 1, but it does not include a formal pseudocode block or algorithm. |
| Open Source Code | Yes | Code and data are available at github.com/jayelm/compexp. |
| Open Datasets | Yes | We take the final 512-unit convolutional layer of a ResNet-18 [15] trained on the Places365 dataset [40], probing for concepts in the ADE20k scenes dataset [41] with atomic concepts C defined by annotations in the Broden dataset [5]. |
| Dataset Splits | Yes | We use the SNLI validation set as our probing dataset (10K examples). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or cloud computing specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions 'spaCy' and 'GloVe embedding space [28]' but does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Since we cannot exhaustively search L(C), in practice we limit ourselves to formulas of maximum length N, by iteratively constructing formulas from primitives via beam search with beam size B = 10. At each step of beam search, we take the formulas already present in our beam, compose them with new primitives, measure IoU of these new formulas, and keep the top B new formulas by IoU, as shown in Figure 1e. |
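The beam-search procedure quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation (see github.com/jayelm/compexp for that): it assumes concepts and neuron activations are given as boolean masks over the same set of inputs, composes formulas with AND, OR, and AND NOT, and keeps the top-B candidates by IoU at each step. All function and variable names here are hypothetical.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union > 0 else 0.0

def beam_search_explanation(neuron_mask, primitives, max_length=3, beam_size=10):
    """Find a logical formula over concept masks that best matches a neuron's
    activation mask, via beam search (a sketch of the paper's procedure).

    neuron_mask: boolean array of neuron firings per input
    primitives: dict mapping concept name -> boolean mask of same shape
    """
    ops = [
        (np.logical_and, "AND"),
        (np.logical_or, "OR"),
        (lambda a, b: np.logical_and(a, np.logical_not(b)), "AND NOT"),
    ]
    # Initialize the beam with single-primitive formulas, scored by IoU.
    beam = [(iou(neuron_mask, m), name, m) for name, m in primitives.items()]
    beam.sort(key=lambda t: -t[0])
    beam = beam[:beam_size]
    # Iteratively extend formulas in the beam with new primitives.
    for _ in range(max_length - 1):
        candidates = list(beam)  # length-k formulas may also survive
        for score, fname, fmask in beam:
            for pname, pmask in primitives.items():
                for op, opname in ops:
                    new_mask = op(fmask, pmask)
                    candidates.append(
                        (iou(neuron_mask, new_mask),
                         f"({fname} {opname} {pname})",
                         new_mask)
                    )
        candidates.sort(key=lambda t: -t[0])
        beam = candidates[:beam_size]
    best_score, best_formula, _ = beam[0]
    return best_formula, best_score

# Toy usage: a "neuron" that fires on water or river regions.
water = np.array([1, 1, 0, 0, 0, 0, 0, 0], dtype=bool)
river = np.array([0, 0, 1, 1, 0, 0, 0, 0], dtype=bool)
sky = np.array([0, 0, 0, 0, 1, 1, 0, 0], dtype=bool)
neuron = np.logical_or(water, river)
formula, score = beam_search_explanation(
    neuron, {"water": water, "river": river, "sky": sky},
    max_length=3, beam_size=10,
)
```

On the toy example, the search recovers a disjunction of water and river with IoU 1.0, mirroring how the paper's method finds compositional (rather than single-concept) explanations when no atomic concept alone matches the neuron.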