Improving Deep Learning Interpretability by Saliency Guided Training

Authors: Aya Abdelsalam Ismail, Héctor Corrada Bravo, Soheil Feizi

NeurIPS 2021

Reproducibility assessment: each variable, its result, and the supporting LLM response.

Research Type: Experimental
We apply the saliency guided training procedure to various synthetic and real datasets from computer vision, natural language processing, and time series, across diverse neural architectures including recurrent neural networks, convolutional networks, and Transformers. Through qualitative and quantitative evaluations, we show that the saliency guided training procedure significantly improves model interpretability across various domains while preserving predictive performance.

Researcher Affiliation: Collaboration
Aya Abdelsalam Ismail, Soheil Feizi, Héctor Corrada Bravo ({asalam,sfeizi}@cs.umd.edu, corradah@gene.com); Department of Computer Science, University of Maryland; Data Science and Statistical Computing, Genentech, Inc.

Pseudocode: Yes
Algorithm 1: Saliency Guided Training (a hedged sketch of the procedure follows this table).

Open Source Code: Yes
Code: https://github.com/ayaabdelsalam91/saliency_guided_training

Open Datasets: Yes
MNIST [30] trained on a simple CNN [29], CIFAR10 [25] trained on ResNet18 [16], and BIRD [13] trained on VGG-16 [48]. Movie Reviews [62]: positive/negative sentiment classification for movie reviews. FEVER [55]: a fact extraction and verification dataset... e-SNLI [8]: a natural language inference task... Saliency guided training was also evaluated on the multivariate time series benchmark proposed by Ismail et al. [21].

Dataset Splits: No
The paper states that training details, including data splits, are available in the supplementary materials, but it does not provide specific train/validation/test split percentages or counts in the main text.

Hardware Specification: No
The paper states that it ran "all experiments on the same GPU" and that details on computational resources are available in the supplementary materials, but it does not name specific hardware models (e.g., GPU model, CPU type) in the main text.

Software Dependencies: No
The paper mentions various models and libraries (e.g., GloVe, LSTM, VGG-16) and states that code is available, but it does not specify software dependencies with version numbers (e.g., PyTorch version, Python version, or specific library versions).

Experiment Setup: No
The paper notes that hyperparameters are available in the supplementary materials. The main text states, "Our training procedure requires two hyperparameters k and λ... we find that λ = 1 works well in all of our experiments," but it does not provide specific values for other common hyperparameters such as learning rate, batch size, or number of epochs.
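As context for the Pseudocode and Experiment Setup rows above: Algorithm 1 trains with a masked-input consistency term. At each step, the k input features with the lowest gradient saliency are masked, and a KL divergence between the model's outputs on the original and masked inputs is added to the classification loss with weight λ. Below is a minimal PyTorch sketch of one such training step. The function name saliency_guided_step, the uniform-noise masking, and the ranking by absolute gradient values are assumptions of this sketch, not the authors' exact implementation; see the linked repository for that.

import torch
import torch.nn.functional as F

def saliency_guided_step(model, x, y, optimizer, k, lam=1.0):
    """One saliency guided training step (sketch, not the reference code).

    k   -- number of lowest-saliency input features to mask per example
    lam -- weight of the KL term; the paper reports lambda = 1 working
           well in all of its experiments
    """
    model.train()

    # Input saliency: gradient of the true-class logit w.r.t. the input.
    x_req = x.clone().detach().requires_grad_(True)
    score = model(x_req).gather(1, y.unsqueeze(1)).sum()
    grad = torch.autograd.grad(score, x_req)[0]

    # Mask the k lowest-saliency features per example; masked entries are
    # replaced with uniform noise over the batch's value range (this
    # masking distribution is an assumption of the sketch).
    flat_x = x.detach().flatten(1)
    flat_saliency = grad.abs().flatten(1)
    _, low_idx = flat_saliency.topk(k, dim=1, largest=False)
    noise = torch.empty_like(flat_x).uniform_(
        flat_x.min().item(), flat_x.max().item())
    x_masked = flat_x.scatter(1, low_idx, noise.gather(1, low_idx)).view_as(x)

    # Loss: cross-entropy on the original input plus lam times the KL
    # divergence between output distributions on original and masked inputs.
    logits = model(x)
    logits_masked = model(x_masked)
    loss = F.cross_entropy(logits, y) + lam * F.kl_div(
        F.log_softmax(logits_masked, dim=1),
        F.softmax(logits, dim=1),
        reduction="batchmean",
    )

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In a training loop this would be called once per batch; k, the number of masked features (e.g., a fixed fraction of the input dimensionality), is the second hyperparameter referenced in the Experiment Setup row.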