Reliable Post hoc Explanations: Modeling Uncertainty in Explainability
Authors: Dylan Slack, Anna Hilgard, Sameer Singh, Himabindu Lakkaraju
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluation with multiple real-world datasets and user studies demonstrates the efficacy of the proposed framework. |
| Researcher Affiliation | Academia | Dylan Slack (UC Irvine, dslack@uci.edu); Sophie Hilgard (Harvard University, ash798@g.harvard.edu); Sameer Singh (UC Irvine, sameer@uci.edu); Himabindu Lakkaraju (Harvard University, hlakkaraju@hbs.edu) |
| Pseudocode | Yes | Algorithm 1: Focused sampling for local explanations (a hedged sketch of this loop appears after the table). |
| Open Source Code | Yes | Project Page: https://dylanslacks.website/reliable/index.html |
| Open Datasets | Yes | Our first structured dataset is COMPAS [27]... The second structured dataset is the German Credit dataset from the UCI repository [28]... We also include popular image datasets MNIST [29] and Imagenet [30]. |
| Dataset Splits | No | We create 80/20 train/test splits for these two datasets [COMPAS, German Credit]... The paper does not explicitly state a separate validation split or the methodology for one. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory, cloud resources) used to run the experiments. |
| Software Dependencies | No | We train a random forest classifier (sklearn implementation with 100 estimators)... The paper mentions 'sklearn' but does not specify a version number for this or any other software dependency. |
| Experiment Setup | Yes | We create 80/20 train/test splits... and train a random forest classifier (sklearn implementation with 100 estimators)... For the MNIST... we train a 2-layer CNN... For generating explanations, we use standard implementations of the baselines LIME and Kernel SHAP with default settings... For our framework, the desired level of certainty is expressed as the width of the 95% credible interval... We use S = 200 as the initial number of perturbations... During focused sampling, we set the batch size B to 50. (Illustrative code sketches of this setup follow the table.) |
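The Experiment Setup row pins down a few concrete choices for the tabular datasets. Below is a minimal sketch of that setup, assuming standard scikit-learn APIs; the data loader and the random seed are placeholders, not taken from the authors' code.

```python
# Minimal sketch of the reported tabular setup: 80/20 train/test split and a
# random forest with 100 estimators (sklearn). Data loading is a placeholder.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_tabular_dataset()  # hypothetical loader for COMPAS or German Credit

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0  # 80/20 split; the seed is an assumption
)

clf = RandomForestClassifier(n_estimators=100)  # "100 estimators" per the paper
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```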
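The same row names LIME and Kernel SHAP as baselines, used through their standard implementations with default settings. Continuing from the sketch above (reusing `clf`, `X_train`, and `X_test`), this is one way those calls typically look; the background-sample size passed to Kernel SHAP is an assumption, not stated in the paper.

```python
# Hedged sketch of the baseline explainers with default settings.
import shap
from lime.lime_tabular import LimeTabularExplainer

# LIME on a single test instance, explaining the class probabilities.
lime_explainer = LimeTabularExplainer(X_train, mode="classification")
lime_exp = lime_explainer.explain_instance(X_test[0], clf.predict_proba)

# Kernel SHAP, using a subsample of the training data as the background set
# (the subsample size of 100 is an assumption).
background = shap.sample(X_train, 100)
shap_explainer = shap.KernelExplainer(clf.predict_proba, background)
shap_values = shap_explainer.shap_values(X_test[0])
```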
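The Pseudocode row names Algorithm 1 (focused sampling for local explanations), and the Experiment Setup row reports its hyperparameters: S = 200 initial perturbations, batches of B = 50, and a stopping criterion expressed as the width of the 95% credible interval. The loop below is only a loose sketch of that control flow under those numbers; the Bayesian surrogate (a `BayesianRidge` stand-in) and the uniform resampling are simplifications, not the authors' uncertainty-guided sampler, and `perturb_fn` and `predict_fn` are hypothetical callables supplied by the caller.

```python
# Loose sketch of a focused-sampling loop driven by credible-interval width.
import numpy as np
from sklearn.linear_model import BayesianRidge  # stand-in Bayesian local model


def focused_sampling_sketch(x, predict_fn, perturb_fn, desired_width,
                            s_init=200, batch_size=50, max_rounds=20):
    """Grow the perturbation set until the 95% credible interval on the local
    feature importances is narrower than `desired_width` for every feature."""
    Z = perturb_fn(x, s_init)                     # S = 200 initial perturbations
    y = predict_fn(Z)
    for _ in range(max_rounds):
        surrogate = BayesianRidge().fit(Z, y)     # local surrogate with a posterior
        mean = surrogate.coef_
        std = np.sqrt(np.diag(surrogate.sigma_))  # posterior std of each coefficient
        width = 2 * 1.96 * std                    # approximate 95% interval width
        if np.all(width <= desired_width):
            break
        Z_new = perturb_fn(x, batch_size)         # B = 50 new perturbations per round;
        y_new = predict_fn(Z_new)                 # the paper targets high-uncertainty
        Z = np.vstack([Z, Z_new])                 # regions, so uniform resampling is
        y = np.concatenate([y, y_new])            # a simplification here
    return mean, width
```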