Improving Deep Learning Interpretability by Saliency Guided Training
Authors: Aya Abdelsalam Ismail, Hector Corrada Bravo, Soheil Feizi
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply the saliency guided training procedure to various synthetic and real data sets from computer vision, natural language processing, and time series across diverse neural architectures, including Recurrent Neural Networks, Convolutional Networks, and Transformers. Through qualitative and quantitative evaluations, we show that the saliency guided training procedure significantly improves model interpretability across various domains while preserving its predictive performance. |
| Researcher Affiliation | Collaboration | Aya Abdelsalam Ismail, Soheil Feizi, Héctor Corrada Bravo ({asalam,sfeizi}@cs.umd.edu, corradah@gene.com); Department of Computer Science, University of Maryland; Data Science and Statistical Computing, Genentech, Inc. |
| Pseudocode | Yes | Algorithm 1: Saliency Guided Training (a hedged code sketch of this procedure follows the table) |
| Open Source Code | Yes | Code: https://github.com/ayaabdelsalam91/saliency_guided_training |
| Open Datasets | Yes | MNIST [30] trained on a simple CNN [29], CIFAR10 [25] trained on ResNet18 [16], and BIRD [13] trained on VGG-16 [48]. Movie Reviews: [62] positive/negative sentiment classification for movie reviews. FEVER: [55] a fact extraction and verification dataset... e-SNLI: [8] a natural language inference task... We evaluated the saliency guided training on a multivariate time series benchmark proposed by Ismail et al. [21]. |
| Dataset Splits | No | The paper states that training details, including data splits, are available in the supplementary materials, but does not provide specific train/validation/test split percentages or counts in the main text. |
| Hardware Specification | No | The paper states that all experiments were run 'on the same GPU' and that details on computational resources are available in the supplementary materials, but it does not provide specific hardware models (e.g., GPU model, CPU type) in the main text. |
| Software Dependencies | No | The paper mentions various models and libraries (e.g., GloVe, LSTM, VGG-16) and states that code is available, but it does not specify software dependencies with version numbers (e.g., PyTorch version, Python version, specific library versions). |
| Experiment Setup | No | The paper notes that full hyperparameters are given in the supplementary materials. In the main text it states, 'Our training procedure requires two hyperparameters k and λ... we find that λ = 1 works well in all of our experiments.' It does not provide specific values for other common hyperparameters such as learning rate, batch size, or number of epochs in the main text; see the sketch below for how k and λ enter the training step. |
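
For concreteness, below is a minimal PyTorch sketch of one step of the saliency guided training procedure summarized by Algorithm 1: compute input-gradient saliency, mask the k lowest-saliency input features, and add a λ-weighted KL divergence between the model's outputs on the original and masked inputs to the classification loss. The function name `saliency_guided_step`, the random fill values used for masked features, and the use of cross-entropy are illustrative assumptions; the authors' exact implementation is in the repository linked above.

```python
import torch
import torch.nn.functional as F

def saliency_guided_step(model, optimizer, x, y, k, lam=1.0):
    """One saliency guided training step (sketch, not the authors' code).

    x: input batch, y: labels, k: number of low-saliency features to mask,
    lam: weight of the KL term (the paper reports lambda = 1 works well).
    """
    # --- Saliency pass: gradient of the loss with respect to the input. ---
    x_sal = x.clone().requires_grad_(True)
    sal = torch.autograd.grad(F.cross_entropy(model(x_sal), y), x_sal)[0]

    # --- Mask the k lowest-|gradient| features of each sample. ---
    # (Assumption: masked features are replaced with random fill values;
    # the paper replaces them with uninformative values.)
    flat = x.flatten(1).clone()
    idx = sal.abs().flatten(1).argsort(dim=1)[:, :k]   # lowest-saliency indices
    fill = torch.rand_like(flat)
    flat.scatter_(1, idx, fill.gather(1, idx))
    x_masked = flat.view_as(x)

    # --- Combined loss: cross-entropy + lam * KL(f(x) || f(x_masked)). ---
    logits, logits_masked = model(x), model(x_masked)
    kl = F.kl_div(F.log_softmax(logits_masked, dim=1),
                  F.softmax(logits, dim=1), reduction="batchmean")
    loss = F.cross_entropy(logits, y) + lam * kl

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

A training loop would call this once per batch, e.g. `saliency_guided_step(model, opt, x, y, k=50, lam=1.0)` with the paper's reported λ = 1; the value of k (here 50, a hypothetical choice) depends on the dataset and modality, since the flattened-feature masking above corresponds to the image case (pixels), while text and time series would mask tokens or time steps/features instead.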