Compositional Explanations of Neurons

Authors: Jesse Mu, Jacob Andreas

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We use this procedure to answer several questions on interpretability in models for vision and natural language processing. First, we examine the kinds of abstractions learned by neurons. In image classification, we find that many neurons learn highly abstract but semantically coherent visual concepts, while other polysemantic neurons detect multiple unrelated features; in natural language inference (NLI), neurons learn shallow lexical heuristics from dataset biases. Second, we see whether compositional explanations give us insight into model performance: vision neurons that detect human-interpretable concepts are positively correlated with task performance, while NLI neurons that fire for shallow heuristics are negatively correlated with task performance.
Researcher Affiliation | Academia | Jesse Mu (Stanford University, muj@stanford.edu); Jacob Andreas (MIT CSAIL, jda@mit.edu)
Pseudocode | No | The paper describes the explanation generation procedure verbally and visually in Figure 1, but it does not include a formal pseudocode block or algorithm.
Open Source Code | Yes | Code and data are available at github.com/jayelm/compexp.
Open Datasets | Yes | We take the final 512-unit convolutional layer of a ResNet-18 [15] trained on the Places365 dataset [40], probing for concepts in the ADE20k scenes dataset [41] with atomic concepts C defined by annotations in the Broden dataset [5].
Dataset Splits | Yes | We use the SNLI validation set as our probing dataset (10K examples).
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or cloud computing specifications used for running the experiments.
Software Dependencies | No | The paper mentions 'spaCy' and 'GloVe embedding space [28]' but does not provide specific version numbers for software dependencies or libraries used in the experiments.
Experiment Setup | Yes | Since we cannot exhaustively search L(C), in practice we limit ourselves to formulas of maximum length N, by iteratively constructing formulas from primitives via beam search with beam size B = 10. At each step of beam search, we take the formulas already present in our beam, compose them with new primitives, measure IoU of these new formulas, and keep the top B new formulas by IoU, as shown in Figure 1e.
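The Experiment Setup quote describes the core search loop: candidate formulas over atomic concepts are grown one primitive at a time by beam search (beam size B = 10, maximum formula length N), and each candidate is scored by IoU between its concept mask and the neuron's binarized activation mask. The following is a minimal sketch of that loop under stated assumptions, not the authors' released implementation (see github.com/jayelm/compexp for the real code); the helper names (`neuron_mask`, `concept_masks`), the exact set of composition operators, and the handling of negation are illustrative choices.

```python
"""Hypothetical sketch of compositional-explanation search via beam search over
concept formulas, scored by IoU against a neuron's thresholded activation mask.
Masks are assumed to be precomputed boolean numpy arrays of identical shape."""
import heapq
import numpy as np


def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum()) / union if union > 0 else 0.0


def beam_search_explanation(neuron_mask, concept_masks, max_length=3, beam_size=10):
    """Grow formulas over atomic concepts, keeping the top-`beam_size` by IoU.

    neuron_mask:   boolean array, True where the neuron's (upsampled, thresholded)
                   activation exceeds its cutoff on the probing dataset.
    concept_masks: dict mapping concept name -> boolean array of the same shape,
                   derived from the probing dataset's concept annotations.
    Returns the best (formula, IoU) found with at most `max_length` primitives.
    """
    # Length-1 formulas: atomic concepts and their negations.
    candidates = []
    for name, mask in concept_masks.items():
        candidates.append((iou(neuron_mask, mask), name, mask))
        candidates.append((iou(neuron_mask, ~mask), f"NOT {name}", ~mask))
    beam = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
    best = max(beam, key=lambda c: c[0])

    # Each expansion step composes every beam member with every new primitive.
    for _ in range(max_length - 1):
        expanded = []
        for score, formula, mask in beam:
            for name, cmask in concept_masks.items():
                for op, new_mask in (
                    ("AND", mask & cmask),
                    ("OR", mask | cmask),
                    ("AND NOT", mask & ~cmask),
                ):
                    expanded.append(
                        (iou(neuron_mask, new_mask), f"({formula}) {op} {name}", new_mask)
                    )
        beam = heapq.nlargest(beam_size, expanded, key=lambda c: c[0])
        best = max([best] + beam, key=lambda c: c[0])
    return best[1], best[0]
```

In the paper, the neuron mask comes from thresholding and upsampling the unit's activation map over the probing images (ADE20k with Broden concept annotations); this sketch treats those masks as precomputed inputs and only illustrates the formula search itself.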