Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Stability Guarantees for Feature Attributions with Multiplicative Smoothing
Authors: Anton Xue, Rajeev Alur, Eric Wong
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate MuS on vision and language models with various feature attribution methods, such as LIME and SHAP, and demonstrate that MuS endows feature attributions with non-trivial stability guarantees. |
| Researcher Affiliation | Academia | Anton Xue, Rajeev Alur, Eric Wong; Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104; EMAIL |
| Pseudocode | No | The paper describes algorithms in text and figures, but does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for their methodology or a direct link to a code repository. |
| Open Datasets | Yes | We use ImageNet1K [31] as our vision dataset and TweetEval [32] sentiment analysis as our language dataset. |
| Dataset Splits | No | The paper mentions training, but does not explicitly state dataset splits (e.g., 80/10/10) for training, validation, and testing. |
| Hardware Specification | No | The paper mentions using specific models like Vision Transformer and ResNet50, but does not provide details on the hardware (e.g., specific GPUs, CPUs) used for experiments. |
| Software Dependencies | No | The paper mentions using Adam [61] as an optimizer, but does not provide version numbers for any software dependencies. |
| Experiment Setup | Yes | Training Details: We used Adam [61] as our optimizer with default parameters and a learning rate of 10⁻⁶ for 5 epochs. Because we consider λ ∈ {1/8, 2/8, 3/8, 4/8, 8/8} and h ∈ {Vision Transformer, ResNet50, RoBERTa}, there are a total of 15 different models for most experiments. To train with a particular λ: for each training input x, we generate two random maskings, one where λ of the features are zeroed and one where λ/2 of the features are zeroed. This additional λ/2 zeroing is to account for the fact that inputs to a smoothed model will be subject to masking by λ as well as φ(x), where the scaling factor of 1/2 is informed by our prior experience about the size of a stable explanation. |
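
The masking step quoted in the Experiment Setup row can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `random_zero_mask` and `two_training_maskings` are hypothetical, the input is treated as a flat feature vector, and zeroing an exact (rounded) count of features is an assumption, since the quoted text does not say whether the fraction is exact or per-feature random. How the two masked copies are then used in the training loss is not described in the quoted text, so the sketch stops at generating them.

```python
import numpy as np


def random_zero_mask(num_features, zero_fraction, rng):
    """Return a 0/1 mask with round(zero_fraction * num_features) zeros.

    Assumption: an exact count of features is zeroed, chosen uniformly
    at random without replacement.
    """
    num_zeroed = int(round(zero_fraction * num_features))
    mask = np.ones(num_features, dtype=np.float32)
    zero_idx = rng.choice(num_features, size=num_zeroed, replace=False)
    mask[zero_idx] = 0.0
    return mask


def two_training_maskings(x, lam, rng):
    """Generate the two masked copies of x described in the setup text:
    one with a fraction lam of features zeroed, one with lam / 2 zeroed.
    """
    n = x.shape[0]
    mask_lam = random_zero_mask(n, lam, rng)
    mask_half = random_zero_mask(n, lam / 2, rng)
    return x * mask_lam, x * mask_half


# Usage with a toy 64-feature input and lam = 2/8.
rng = np.random.default_rng(0)
x = np.ones(64, dtype=np.float32)
x_lam, x_half = two_training_maskings(x, 2 / 8, rng)
print(int((x_lam == 0).sum()))   # 16 of 64 features zeroed
print(int((x_half == 0).sum()))  # 8 of 64 features zeroed
```

For image or text inputs the mask would presumably be drawn over patches or tokens rather than raw entries, but that grouping is not specified in the row above.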