Stability Guarantees for Feature Attributions with Multiplicative Smoothing

Authors: Anton Xue, Rajeev Alur, Eric Wong

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate MuS on vision and language models with various feature attribution methods, such as LIME and SHAP, and demonstrate that MuS endows feature attributions with non-trivial stability guarantees.
Researcher Affiliation | Academia | Anton Xue, Rajeev Alur, Eric Wong. Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104. {antonxue,alur,exwong}@seas.upenn.edu
Pseudocode | No | The paper describes algorithms in text and figures, but does not include a formally labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code for their methodology or a direct link to a code repository.
Open Datasets | Yes | We use ImageNet1K [31] as our vision dataset and TweetEval [32] sentiment analysis as our language dataset.
Dataset Splits | No | The paper mentions training, but does not explicitly state dataset splits (e.g., 80/10/10) for training, validation, and testing.
Hardware Specification | No | The paper mentions using specific models like Vision Transformer and ResNet50, but does not provide details on the hardware (e.g., specific GPUs, CPUs) used for experiments.
Software Dependencies | No | The paper mentions using Adam [61] as an optimizer, but does not provide version numbers for any software dependencies.
Experiment Setup | Yes | Training Details: We used Adam [61] as our optimizer with default parameters and a learning rate of 10^-6 for 5 epochs. Because we consider λ ∈ {1/8, 2/8, 3/8, 4/8, 8/8} and h ∈ {Vision Transformer, ResNet50, RoBERTa}, there are a total of 15 different models for most experiments. To train with a particular λ: for each training input x, we generate two random maskings, one where a λ fraction of the features is zeroed and one where a λ/2 fraction is zeroed. This additional λ/2 zeroing accounts for the fact that inputs to a smoothed model are subject to masking by λ as well as by φ(x); the scaling factor of 1/2 is informed by our prior experience about the size of a stable explanation.
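The Experiment Setup row describes generating, for each training input, two randomly masked copies: one with a λ fraction of features zeroed and one with a λ/2 fraction zeroed. A minimal sketch of that masking step, assuming flat feature vectors and a hypothetical `random_mask` helper (the paper's actual implementation is not released):

```python
import random

def random_mask(x, frac):
    """Zero out a randomly chosen `frac` fraction of the features of x.

    x is a flat list of feature values; the number of zeroed features
    is int(frac * len(x)), chosen uniformly without replacement.
    """
    d = len(x)
    k = int(frac * d)
    zero_idx = set(random.sample(range(d), k))
    return [0.0 if i in zero_idx else v for i, v in enumerate(x)]

# Per the paper: for each training input x, generate two random maskings,
# one zeroing a lambda fraction and one zeroing a lambda/2 fraction.
lam = 2 / 8                              # one of the lambdas considered
x = [float(i + 1) for i in range(16)]    # toy 16-feature input
x_lam = random_mask(x, lam)              # 4 of 16 features zeroed
x_half = random_mask(x, lam / 2)         # 2 of 16 features zeroed
```

In an actual training loop, both masked copies would be fed to the model h alongside the usual loss, so that h learns to behave well under the feature-level masking that the smoothed model applies at evaluation time.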