Understanding Global Feature Contributions With Additive Importance Measures

Authors: Ian Covert, Scott M. Lundberg, Su-In Lee

NeurIPS 2020

Reproducibility assessment. Each entry gives the variable, the result, and the supporting LLM response.
Research Type: Experimental
LLM Response: "Our experiments show that SAGE can be calculated efficiently and that it assigns more accurate importance values than other methods." From Section 5 (Experiments): "We now evaluate SAGE by comparing it with several baseline methods. For simplicity we only consider model-agnostic baselines, including permutation tests, mean importance, feature ablation and univariate predictors (see Section 2.3). For datasets, we used MNIST [19], a bike sharing demand dataset [10], the German credit quality dataset [21], the Portuguese bank marketing dataset [26], and a breast cancer (BRCA) subtype classification dataset [4, 39]."
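For context on the baselines named above, the permutation test is straightforward to sketch. The following is a generic permutation-importance implementation, not the authors' code; the model and loss function are placeholders for any fitted estimator and task-appropriate loss.

```python
import numpy as np

def permutation_importance(model, X, y, loss_fn, seed=None):
    """Permutation-test baseline: shuffle one feature at a time and
    measure how much the held-out loss increases."""
    rng = np.random.default_rng(seed)
    base_loss = loss_fn(y, model.predict(X))
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        X_perm = X.copy()
        # Shuffling column j breaks its association with the target.
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        importance[j] = loss_fn(y, model.predict(X_perm)) - base_loss
    return importance
```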
Researcher Affiliation: Collaboration
LLM Response: Ian C. Covert, University of Washington, Seattle, WA (icovert@uw.edu); Scott Lundberg, Microsoft Research, Redmond, WA (scott.lundberg@microsoft.com); Su-In Lee, University of Washington, Seattle, WA (suinlee@uw.edu)
Pseudocode: Yes
LLM Response: "Supplement D describes the SAGE sampling algorithm (Algorithm 1) and the changes to its properties in more detail."
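Algorithm 1 itself is not reproduced in this summary, but its core idea, permutation sampling over feature orderings with held-out features marginalized out, can be sketched as follows. This is a minimal illustration that assumes mean imputation in place of the paper's sampling from the marginal distribution, so it only approximates the estimator's behavior.

```python
import numpy as np

def sage_values_sketch(model, X, y, loss_fn, n_iters=1000, seed=None):
    """Rough permutation-sampling estimate of SAGE values, in the
    spirit of Algorithm 1. Unrevealed features are mean-imputed here
    as a cheap stand-in for marginalizing over them."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    x_mean = X.mean(axis=0)
    phi = np.zeros(d)
    for _ in range(n_iters):
        i = rng.integers(n)               # sample one (x, y) pair
        order = rng.permutation(d)        # random feature ordering
        x_partial = x_mean.copy()         # start with no features revealed
        prev_loss = loss_fn(y[i:i + 1], model.predict(x_partial[None, :]))
        for j in order:
            x_partial[j] = X[i, j]        # reveal feature j
            cur_loss = loss_fn(y[i:i + 1], model.predict(x_partial[None, :]))
            phi[j] += prev_loss - cur_loss  # credit j with the loss reduction
            prev_loss = cur_loss
    return phi / n_iters
```

Mean imputation keeps the sketch self-contained; the paper's estimator instead samples held-out features from their marginal distribution, which Supplement D analyzes.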
Open Source Code: Yes
LLM Response: http://github.com/iancovert/sage/
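The repository's README documents a compact API for computing SAGE values. A usage sketch along those lines follows; the class names, loss string, and background-data size are taken from the repository's documentation and should be verified against the installed version.

```python
import sage  # the package published from the repository above

# model: any fitted predictor; X, y: held-out evaluation data.
imputer = sage.MarginalImputer(model, X[:512])         # background data for marginalization
estimator = sage.PermutationEstimator(imputer, 'mse')  # use 'cross entropy' for classifiers
sage_vals = estimator(X, y)
sage_vals.plot(feature_names)  # bar plot of global feature importance
```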
Open Datasets: Yes
LLM Response: "For datasets, we used MNIST [19], a bike sharing demand dataset [10], the German credit quality dataset [21], the Portuguese bank marketing dataset [26], and a breast cancer (BRCA) subtype classification dataset [4, 39]."
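All five datasets are publicly available. For example, MNIST and the German credit data can be fetched from OpenML via scikit-learn; the OpenML identifiers below are assumptions based on the standard copies of these datasets, not names given in the paper.

```python
from sklearn.datasets import fetch_openml

# MNIST: 70,000 handwritten digits, flattened to 784 features.
mnist = fetch_openml('mnist_784', version=1, as_frame=False)

# German credit (Statlog): 1,000 loan applications labeled good/bad.
credit = fetch_openml('credit-g', version=1, as_frame=True)
```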
Dataset Splits: Yes
LLM Response: "For datasets, we used MNIST [19]..." The use of a held-out split is evidenced by Figure 3 ("Identifying corrupted features with SAGE"; top right: "SAGE comparison to identify corruption in month feature"), which reports results on validation data.
Hardware Specification: No
LLM Response: The paper describes the datasets and models used (e.g., XGBoost, CatBoost, MLP) but does not provide specific details about the hardware on which the experiments were run (e.g., GPU models, CPU types, memory, or cloud resources).
Software Dependencies: No
LLM Response: The paper mentions specific software libraries such as XGBoost [8] and CatBoost [29], and methods such as regularized logistic regression, but it does not provide version numbers for these dependencies.
Experiment Setup: No
LLM Response: The paper names the model used for each dataset ("XGBoost for the bike data", "CatBoost for the bank and credit data", "regularized logistic regression for the BRCA data", "multi-layer perceptron (MLP) for MNIST") but does not provide experimental setup details such as hyperparameter values, learning rates, batch sizes, or training schedules.
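In practice, a re-implementation therefore has to choose its own settings. A minimal reconstruction of the per-dataset models using library defaults is sketched below; every hyperparameter shown is an assumption, since the paper reports none.

```python
from xgboost import XGBRegressor                      # bike sharing (regression)
from catboost import CatBoostClassifier               # bank and credit data
from sklearn.linear_model import LogisticRegression   # BRCA subtypes
from sklearn.neural_network import MLPClassifier      # MNIST

# Library defaults throughout; none of these values come from the paper.
bike_model = XGBRegressor()
bank_model = CatBoostClassifier(verbose=0)
brca_model = LogisticRegression(penalty='l2', C=1.0, max_iter=1000)  # "regularized" assumed to mean L2
mnist_model = MLPClassifier(hidden_layer_sizes=(256, 256))           # architecture assumed
```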