Attention-based Interpretability with Concept Transformers
Authors: Mattia Rigotti, Christoph Miksovic, Ioana Giurgiu, Thomas Gschwind, Paolo Scotton
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our Concept Transformer module on established explainability benchmarks and show how it can be used to infuse domain knowledge into classifiers to improve accuracy, and conversely to extract concept-based explanations of classification outputs. |
| Researcher Affiliation | Industry | Mattia Rigotti, Christoph Miksovic, Ioana Giurgiu, Thomas Gschwind & Paolo Scotton IBM Research Zurich, Switzerland {mrg,cmi,igi,thg,psc}@zurich.ibm.com |
| Pseudocode | Yes | C APPENDIX: CONCEPTTRANSFORMER PYTORCH CODE |
| Open Source Code | Yes | Code to reproduce our results is available at: https://github.com/ibm/concept_transformer. |
| Open Datasets | Yes | We validate our approach on three image benchmark datasets, MNIST Even/Odd (Barbiero et al., 2021), CUB-200-2011 (Welinder et al., 2010), and aPY (Farhadi et al., 2009). |
| Dataset Splits | Yes | Figure 2 shows the accuracy on the test set (left) and explanation loss during validation (right), relative to the number of samples used at training, which varies from 100 to 7000. [...] For validation and testing, only resizing and normalization were applied. |
| Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU models, CPU types, or memory). |
| Software Dependencies | No | We use the Albumentations library by Buslaev et al. (2020). The paper names the Albumentations library but does not give version numbers for key software components (e.g., the PyTorch version used for the code in Appendix C). |
| Experiment Setup | Yes | At training time, the following augmentations were applied to the individual object samples: resizing to a standardized format (H × W = 320 × 320 pixels), random horizontal flipping with probability p = 0.5, random rotations in the range of ±15° based on a uniform probability distribution, and normalization. For validation and testing, only resizing and normalization were applied. |
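The core mechanism behind the ConceptTransformer module quoted above (Appendix C of the paper provides the authors' PyTorch code) is cross-attention from patch tokens to learned concept embeddings, where the attention weights double as concept-based explanations. The following is a minimal, stdlib-only sketch of that idea; all dimensions, names, and weight initializations here are illustrative assumptions, not the authors' implementation:

```python
import math
import random

random.seed(0)

D = 8            # embedding dimension (illustrative)
N_TOKENS = 4     # number of patch tokens (illustrative)
N_CONCEPTS = 3   # number of human-interpretable concepts (illustrative)

def rand_matrix(rows, cols):
    """Small random matrix, a stand-in for learned weights."""
    return [[random.gauss(0, 0.5) for _ in range(cols)] for _ in range(rows)]

def matmul(a, b):
    """Naive matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def softmax(xs):
    """Numerically stable softmax over a single list."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def concept_attention(tokens, concepts, w_q, w_k, w_v):
    """Cross-attention: patch tokens query the concept embeddings.

    Returns (outputs, attn), where attn[i][j] is the attention weight
    of token i on concept j -- the quantity that serves as a
    concept-based explanation of the prediction.
    """
    q = matmul(tokens, w_q)      # queries from image patch tokens
    k = matmul(concepts, w_k)    # keys from concept embeddings
    v = matmul(concepts, w_v)    # values from concept embeddings
    scale = math.sqrt(D)
    attn = [softmax([sum(qi * kj for qi, kj in zip(qrow, krow)) / scale
                     for krow in k])
            for qrow in q]
    out = matmul(attn, v)        # concept-weighted token representations
    return out, attn

tokens = rand_matrix(N_TOKENS, D)
concepts = rand_matrix(N_CONCEPTS, D)
out, attn = concept_attention(
    tokens, concepts,
    rand_matrix(D, D), rand_matrix(D, D), rand_matrix(D, D))
```

Each row of `attn` is a distribution over the concepts, so explanations come "for free" from the forward pass; the released repository (https://github.com/ibm/concept_transformer) contains the actual trainable module.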