FastSHAP: Real-Time Shapley Value Estimation

Authors: Neil Jethani, Mukund Sudarshan, Ian Connick Covert, Su-In Lee, Rajesh Ranganath

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments with tabular and image datasets, we compare FastSHAP to existing estimation approaches and find that it generates accurate explanations with an orders-of-magnitude speedup.
Researcher Affiliation | Academia | Neil Jethani (New York University), Mukund Sudarshan (New York University), Ian Covert (University of Washington), Su-In Lee (University of Washington), Rajesh Ranganath (New York University)
Pseudocode | Yes | Algorithm 1: FastSHAP training (a minimal sketch of this training objective appears after the table)
Open Source Code | Yes | Code to implement FastSHAP is available online in two separate repositories: https://github.com/iancovert/fastshap contains a PyTorch implementation and https://github.com/neiljethani/fastshap/ contains a TensorFlow implementation, both with examples of tabular and image data experiments.
Open Datasets | Yes | Our experiments use data from a 1994 United States census, a bank marketing campaign, bankruptcy statistics, and online news articles (Dua and Graff, 2017). The census data contains 12 input features, and the binary label indicates whether a person makes over $50K a year (Kohavi et al., 1996). The marketing dataset contains 17 input features, and the label indicates whether the customer subscribed to a term deposit (Moro et al., 2014). The bankruptcy dataset contains 96 features describing various companies and whether they went bankrupt (Liang et al., 2016). The news dataset contains 60 numerical features about articles published on Mashable, and our label indicates whether the share count exceeds the median number (1400) (Fernandes et al., 2015). The datasets were each split 80/10/10 for training, validation, and testing.
Dataset Splits | Yes | The datasets were each split 80/10/10 for training, validation, and testing. (A split sketch appears after the table.)
Hardware Specification | Yes | The image experiments were run using 8 cores of an Intel Xeon Gold 6148 processor and a single NVIDIA Tesla V100.
Software Dependencies | No | The paper mentions software like PyTorch, TensorFlow, LightGBM, XGBoost, the shap package, and the tf-explain package, but does not specify their version numbers.
Experiment Setup | Yes | The models are trained using Adam with a learning rate of 10^-3, and we use a learning rate scheduler that multiplies the learning rate by a factor of 0.5 after 3 epochs of no validation loss improvement. Early stopping was triggered after the validation loss ceased to improve for 10 epochs. Each model is trained using Adam with a learning rate of 10^-3, and we use a learning rate scheduler that multiplies the learning rate by a factor of 0.8 after 3 epochs of no validation loss improvement. Early stopping was triggered after the validation loss ceased to improve for 20 epochs. (An optimization-setup sketch appears after the table.)
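
The Pseudocode row records that the paper provides Algorithm 1 for FastSHAP training. As a rough illustration only, here is a minimal PyTorch sketch of the weighted least-squares idea behind it: an explainer network is regressed onto the value function under coalitions sampled from the Shapley kernel distribution. The explainer architecture, the `value_function` helper, and the sampling code are assumptions for illustration, not the authors' Algorithm 1; the paper's efficiency constraint (e.g., additive efficient normalization) is omitted here.

```python
import torch

d = 12  # number of input features (e.g., the census dataset)

# Shapley kernel distribution over coalition sizes 1..d-1 (sizes 0 and d
# carry infinite kernel weight and are handled by constraints instead).
sizes = torch.arange(1, d)
size_probs = 1.0 / (sizes * (d - sizes))
size_probs /= size_probs.sum()

def sample_masks(batch):
    """Sample coalitions S with p(S) proportional to the Shapley kernel."""
    k = sizes[torch.multinomial(size_probs, batch, replacement=True)]
    masks = torch.zeros(batch, d)
    for i in range(batch):
        masks[i, torch.randperm(d)[: int(k[i])]] = 1.0
    return masks

# phi(x; theta): maps an input to d estimated Shapley values (toy architecture).
explainer = torch.nn.Sequential(
    torch.nn.Linear(d, 64), torch.nn.ReLU(), torch.nn.Linear(64, d))
optimizer = torch.optim.Adam(explainer.parameters(), lr=1e-3)

def training_step(x, value_function):
    # value_function(x, S) is assumed to return v(x, S) per sample, e.g.,
    # a surrogate model evaluated on masked inputs (an assumption here).
    S = sample_masks(x.shape[0])
    phi = explainer(x)
    target = value_function(x, S) - value_function(x, torch.zeros_like(S))
    pred = (S * phi).sum(dim=1)       # 1_S^T phi(x)
    loss = ((target - pred) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key design point is that amortization moves the per-sample cost of Shapley estimation into training: after training, a single forward pass of `explainer` yields the explanation.

The Dataset Splits row reports an 80/10/10 train/validation/test split. A minimal sketch of one way to reproduce such a split with scikit-learn follows; the helper name and seed are illustrative, not taken from the paper.

```python
from sklearn.model_selection import train_test_split

def split_80_10_10(X, y, seed=0):
    # Carve off 20% for validation + test, then split that portion in half.
    X_train, X_tmp, y_train, y_tmp = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(
        X_tmp, y_tmp, test_size=0.5, random_state=seed)
    return X_train, X_val, X_test, y_train, y_val, y_test
```

Finally, the Experiment Setup row describes Adam at 10^-3 with a reduce-on-plateau scheduler and early stopping. Below is a self-contained PyTorch sketch of that optimization setup under the first reported configuration (factor 0.5, patience 3, early stopping after 10 stale epochs); the toy model, data, and validation stand-in are assumptions for illustration.

```python
import torch

torch.manual_seed(0)
X = torch.randn(256, 12)             # toy stand-in for 12-feature tabular data
y = torch.randint(0, 2, (256,))      # toy binary labels

model = torch.nn.Linear(12, 2)
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Multiply the LR by 0.5 after 3 epochs of no improvement (the paper's
# second configuration uses factor 0.8 with a 20-epoch stopping patience).
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.5, patience=3)

best_val, stale, patience = float('inf'), 0, 10
for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

    val_loss = loss.item()           # stand-in for a real validation pass
    scheduler.step(val_loss)
    if val_loss < best_val:
        best_val, stale = val_loss, 0
    else:
        stale += 1
        if stale >= patience:        # early stopping
            break
```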
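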
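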