Explanations of Black-Box Models based on Directional Feature Interactions

Authors: Aria Masoomi, Davin Hill, Zhonghui Xu, Craig P. Hersh, Edwin K. Silverman, Peter J. Castaldi, Stratis Ioannidis, Jennifer Dy

ICLR 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We apply our bivariate method on Shapley value explanations, and experimentally demonstrate the ability of directional explanations to discover feature interactions. We show the superiority of our method against state-of-the-art on CIFAR10, IMDB, Census, Divorce, Drug, and gene data." |
| Researcher Affiliation | Academia | "1 Northeastern University, Department of Electrical and Computer Engineering, Boston, MA, USA. 2 Brigham and Women's Hospital, Channing Division of Network Medicine, Boston, MA, USA" |
| Pseudocode | Yes | "Algorithm 1: Approximate Graph G with Shapley Sampling Algorithm" |
| Open Source Code | Yes | "All source code is publicly available." (Footnote 3: https://github.com/davinhill/BivariateShapley) |
| Open Datasets | Yes | "We evaluate our methods on COPDGene (Regan et al., 2010), CIFAR10 (Krizhevsky, 2009) and MNIST (LeCun & Cortes, 2010) image data, IMDB text data, and on three tabular UCI datasets (Drug, Divorce, and Census) (Dua & Graff, 2017)." |
| Dataset Splits | No | Table 3, "Summary of the datasets and models in our investigation", provides train/test sample counts (e.g., 1,641/407 for COPD) but specifies neither a separate validation split nor a cross-validation methodology. |
| Hardware Specification | Yes | "All experiments are performed on an internal cluster with Intel Xeon Gold 6132 CPUs and Nvidia Tesla V100 GPUs." |
| Software Dependencies | No | The paper mentions several software packages and libraries, such as NetworkX, scikit-network, KernelSHAP, PyTorch Geometric, NLTK, GloVe, Adam, and XGBoost, but does not specify version numbers for any of them (e.g., "We use the package NetworkX (Schult, 2008)"). |
| Experiment Setup | Yes | Section G.1.3 provides experimental setup details for each dataset and model. For example, for COPDGene: "We use a neural network with 4 fully-connected layers of 200 hidden units, batch normalization, and relu activation. The model is trained using Adam (Kingma & Ba, 2017) with learning rate 10^-3 for 800 epochs." |
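The pseudocode entry above refers to the paper's Algorithm 1, which approximates an interaction graph G via Shapley sampling. The paper's bivariate, directional construction is not reproduced here; as a point of reference, the sketch below shows only generic permutation-based Shapley value sampling (the univariate building block), with an illustrative `value_fn` interface that is our assumption, not the authors' API.

```python
import random

def shapley_sampling(value_fn, n_features, n_samples=2000, seed=0):
    """Estimate Shapley values by sampling random feature permutations.

    value_fn: maps a frozenset of feature indices (a coalition) to a scalar
    payoff, e.g. a model's output with the remaining features masked out.
    Returns one estimated Shapley value per feature.
    """
    rng = random.Random(seed)
    phi = [0.0] * n_features
    for _ in range(n_samples):
        perm = list(range(n_features))
        rng.shuffle(perm)
        coalition = set()
        prev = value_fn(frozenset(coalition))
        for i in perm:
            # Marginal contribution of feature i given the features before it.
            coalition.add(i)
            cur = value_fn(frozenset(coalition))
            phi[i] += cur - prev
            prev = cur
    return [p / n_samples for p in phi]
```

For an additive game (payoff = sum of per-feature weights), every marginal contribution of feature i equals its weight, so the estimate recovers the weights exactly even with few samples.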
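The COPDGene setup quoted in the Experiment Setup row (4 fully-connected layers of 200 hidden units, batch normalization, ReLU, Adam with learning rate 10^-3 for 800 epochs) can be sketched in PyTorch as follows. The input dimension, number of classes, and function name are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

def make_copd_model(n_features: int, n_classes: int, hidden: int = 200) -> nn.Sequential:
    """4 fully-connected layers of 200 hidden units with batch norm and ReLU,
    followed by a linear output head (the head size is our assumption)."""
    layers = []
    in_dim = n_features
    for _ in range(4):
        layers += [nn.Linear(in_dim, hidden), nn.BatchNorm1d(hidden), nn.ReLU()]
        in_dim = hidden
    layers.append(nn.Linear(in_dim, n_classes))
    return nn.Sequential(*layers)

# Hypothetical sizes for illustration; COPDGene's true dimensions differ.
model = make_copd_model(n_features=100, n_classes=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # learning rate 10^-3 per the paper
# Training loop over 800 epochs omitted.
```

The 800-epoch training loop itself is standard and is left out for brevity.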