CXPlain: Causal Explanations for Model Interpretation under Uncertainty
Authors: Patrick Schwab, Walter Karlen
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experiments that demonstrate that CXPlain is significantly more accurate and faster than existing model-agnostic methods for estimating feature importance. |
| Researcher Affiliation | Academia | Patrick Schwab and Walter Karlen, Institute of Robotics and Intelligent Systems, ETH Zurich; patrick.schwab@hest.ethz.ch |
| Pseudocode | Yes | We repeat this process M times to obtain a bootstrap ensemble of M explanation models (Algorithm in Appendix B). A sketch of this bootstrap step follows the table. |
| Open Source Code | Yes | Source code is available at https://github.com/d909b/cxplain. |
| Open Datasets | Yes | To compare the accuracy of CXPlain to existing state-of-the-art methods for feature importance estimation, we evaluated its ability to identify important features in MNIST [51] and ImageNet [52] images. |
| Dataset Splits | No | The paper references using MNIST and ImageNet test sets, and discusses training on subsets via bootstrap resampling, but it does not specify explicit training/validation/test dataset splits with percentages or counts in the main text. |
| Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPUs used for this research. |
| Software Dependencies | No | The paper mentions TensorFlow [54] but does not provide specific version numbers for it or any other key software dependencies used in the experiments. |
| Experiment Setup | No | As a preprocessing step, pixel values were scaled to be in the range of [0, 1] prior to training. We then used several importance estimation methods to determine which input pixels were most important for the classification model's decisions on N = 100 test images. We masked the top 10% and 30% of those most important pixels for MNIST and ImageNet, respectively, and measured the resulting change in the classification model's confidences by computing the difference in log odds. Further training details are given in Appendix A. A sketch of this masking evaluation also follows the table. |
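
The pseudocode row above refers to the paper's bootstrap-ensembling step: M explanation models are trained on bootstrap resamples of the data, and the spread of their outputs serves as an uncertainty estimate. The following is a minimal sketch of that idea, not the CXPlain implementation; `train_explanation_model`, the `.predict` interface, and the median/standard-deviation aggregation are illustrative assumptions.

```python
import numpy as np


def bootstrap_explanation_ensemble(x_train, y_train, train_explanation_model, m=10, seed=0):
    """Train M explanation models, each on a bootstrap resample of the training data."""
    rng = np.random.default_rng(seed)
    n = len(x_train)
    models = []
    for _ in range(m):
        idx = rng.integers(0, n, size=n)  # sample n indices with replacement
        models.append(train_explanation_model(x_train[idx], y_train[idx]))
    return models


def attribution_with_uncertainty(models, x):
    """Aggregate per-feature importance scores across the ensemble.

    Returns a central estimate (median) and a per-feature spread (std),
    the latter acting as an uncertainty estimate for the attribution.
    """
    scores = np.stack([model.predict(x) for model in models])  # shape (M, n_features)
    return np.median(scores, axis=0), np.std(scores, axis=0)
```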
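The experiment-setup row quotes the paper's masking evaluation: mask the top 10% (MNIST) or 30% (ImageNet) most important pixels according to an attribution map and measure the resulting change in the classifier's log odds for its original prediction. Below is a hedged sketch of that metric, assuming a classifier exposing a `predict` method that returns class probabilities; it is not the authors' evaluation code.

```python
import numpy as np


def log_odds(p, eps=1e-7):
    """Log odds of a probability, clipped away from 0 and 1 for numerical stability."""
    p = np.clip(p, eps, 1.0 - eps)
    return np.log(p / (1.0 - p))


def masked_log_odds_change(model, image, attribution, top_fraction=0.1, mask_value=0.0):
    """Change in log odds of the predicted class after masking the top-k% pixels.

    A strongly negative return value indicates that the masked pixels were
    indeed important for the model's original prediction.
    """
    probs = model.predict(image[None])[0]
    target = int(np.argmax(probs))

    # Indices of the most important pixels according to the attribution map.
    k = int(top_fraction * attribution.size)
    top_idx = np.argsort(attribution.ravel())[::-1][:k]

    # Replace those pixels with the mask value (e.g. 0 for [0, 1]-scaled inputs).
    masked = image.copy().ravel()
    masked[top_idx] = mask_value
    masked = masked.reshape(image.shape)

    masked_probs = model.predict(masked[None])[0]
    return log_odds(masked_probs[target]) - log_odds(probs[target])
```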