Improving Decision Sparsity

Authors: Yiyang Sun, Tong Wang, Cynthia Rudin

NeurIPS 2024

Reproducibility assessment (variable, result, and supporting LLM response):

Research Type: Experimental
To evaluate whether our proposed methods would achieve sparser, more credible, and closer explanations, we present experiments on seven datasets: (i) the UCI Adult Income dataset for predicting income levels [Dua and Graff, 2017], (ii) the FICO Home Equity Line of Credit dataset for assessing credit risk, used for the Explainable Machine Learning Challenge [FICO, 2018], (iii) the UCI German Credit dataset for determining creditworthiness [Dua and Graff, 2017], (iv) the MIMIC-III dataset for predicting patient outcomes in intensive care units [Johnson et al., 2016a,b], (v) the COMPAS dataset [Larson and Angwin, 2016; Wang et al., 2022a] for predicting recidivism, (vi) the Diabetes dataset [Strack et al., 2014] for predicting whether patients will be readmitted within two years, and (vii) the Headline dataset for predicting whether a headline is likely to be shared by readers [Chen et al., 2023].

Researcher Affiliation: Academia
Yiyang Sun (Duke University), Tong Wang (Yale University), Cynthia Rudin (Duke University).

Pseudocode: Yes
Algorithm 1: Reference Search for Flexible SEV (Appendix D). Algorithm 2: Preprocessing, the information collection process for SEVT (Appendix E). Algorithm 3: Efficient SEVT Calculation, Negative Pathways Check (Appendix E).

Open Source Code: Yes
The code for training and evaluation is provided in the Experiment folder, and the scripts for running it in the Script folder.

Open Datasets: Yes
UCI Adult Income dataset for predicting income levels [Dua and Graff, 2017], FICO Home Equity Line of Credit dataset for assessing credit risk [FICO, 2018], MIMIC-III dataset for predicting patient outcomes in intensive care units [Johnson et al., 2016a,b], COMPAS dataset [Larson and Angwin, 2016; Wang et al., 2022a], Diabetes dataset [Strack et al., 2014], and Headline dataset [Chen et al., 2023].
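
The paper does not say how these public datasets were retrieved. As one hedged example, a copy of the UCI Adult dataset can be pulled from OpenML through scikit-learn; the dataset name and OpenML version below are assumptions about which mirror matches the authors' copy, not their actual pipeline.

```python
from sklearn.datasets import fetch_openml

# Fetch a public mirror of the UCI Adult Income dataset from OpenML.
# The paper does not specify its download source, so this loader (and the
# OpenML version number) is an assumption, not the authors' pipeline.
adult = fetch_openml("adult", version=2, as_frame=True)
X, y = adult.data, adult.target
print(X.shape)           # feature matrix dimensions
print(y.value_counts())  # class balance for the income label
```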

Dataset Splits: No
The datasets were divided into training and test sets using an 80-20 stratified split. The paper specifies the train and test portions but does not report percentages or counts for a separate validation split.
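
A minimal sketch of the reported 80-20 stratified split, assuming scikit-learn's train_test_split; the feature matrix, labels, and random seed are placeholders, since the paper reports neither a seed nor a validation split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder features and binary labels standing in for one of the
# tabular datasets; the shapes here are illustrative only.
X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=1000)

# 80-20 stratified train/test split as described in the paper. No separate
# validation split is reported, and random_state is an assumed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
```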

Hardware Specification: Yes
All models were trained on an RTX 2080 Ti GPU with 4 cores of an Intel(R) Xeon(R) Gold 6226 CPU @ 2.70GHz.

Software Dependencies: No
Baseline models were fit using sklearn [Pedregosa et al., 2011] implementations in Python, and the resulting loss was minimized via gradient descent in PyTorch [Paszke et al., 2019]. The paper names these packages but does not give version numbers for them.
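
Because the versions go unreported, a reproduction has to pin them down itself; a small sketch of logging the installed releases of the two named packages follows (our addition, not something the paper does).

```python
# Log the installed versions of the dependencies the paper names,
# since it cites sklearn and PyTorch without stating their releases.
import sklearn
import torch

print("scikit-learn:", sklearn.__version__)
print("torch:", torch.__version__)
```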

Experiment Setup: Yes
The 2-layer MLP used ReLU activations and consisted of two fully connected layers with 128 nodes each; it was trained with early stopping. The gradient-boosted classifier used 200 trees with a max depth of 3. The loss was minimized via gradient descent in PyTorch [Paszke et al., 2019] with a batch size of 128, a learning rate of 0.1, and the Adam optimizer. The first 80 training epochs are warm-up epochs optimizing only the binary cross-entropy classification loss (BCELoss); the next 20 epochs add the All-Opt terms.
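
A sketch of this setup under stated assumptions: the synthetic data, the input width, the reading of "2-layer MLP" as two 128-unit hidden layers before a sigmoid output, and the all_opt_penalty stub are all placeholders; the paper's actual All-Opt terms and its early-stopping criterion are not reproduced here.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in data; shapes are illustrative, not from the paper.
X = torch.rand(1024, 20)
y = torch.randint(0, 2, (1024,)).float()
loader = DataLoader(TensorDataset(X, y), batch_size=128, shuffle=True)

# Gradient-boosted baseline with the reported 200 trees of max depth 3.
gbt = GradientBoostingClassifier(n_estimators=200, max_depth=3)
gbt.fit(X.numpy(), y.numpy())

# MLP with two fully connected hidden layers of 128 ReLU units each,
# plus a sigmoid output head for binary classification.
model = nn.Sequential(
    nn.Linear(X.shape[1], 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 1), nn.Sigmoid(),
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)  # reported optimizer/lr
bce = nn.BCELoss()

def all_opt_penalty(model, xb):
    # Hypothetical stand-in for the paper's All-Opt terms, which are not
    # reproduced here; it contributes nothing to the loss in this sketch.
    return torch.tensor(0.0)

for epoch in range(100):  # 80 warm-up epochs + 20 epochs adding All-Opt terms
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = bce(model(xb).squeeze(1), yb)
        if epoch >= 80:
            loss = loss + all_opt_penalty(model, xb)
        loss.backward()
        optimizer.step()
```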