Interpretable Generalized Additive Models for Datasets with Missing Values
Authors: Hayden McTavish, Jon Donnelly, Margo Seltzer, Cynthia Rudin
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now evaluate the performance, runtime, and sparsity of M-GAM in comparison to other methods. To evaluate M-GAM in a realistic setting, we primarily consider four datasets: the Explainable Machine Learning Challenge dataset (FICO et al., 2018) (referred to as FICO), a breast cancer dataset introduced by Razavi et al. (2018) (referred to as Breast Cancer), the MIMIC-III critical care dataset (Johnson et al., 2016) (referred to as MIMIC), and a dataset concerning the prediction of pharyngitis introduced by Miyagi (2023) (referred to as Pharyngitis). ... We use AUC, rather than accuracy, when evaluating model performance for Breast Cancer and MIMIC because these two datasets are heavily imbalanced. |
| Researcher Affiliation | Academia | Hayden McTavish, Department of Computer Science, Duke University, Durham, NC 27705, hayden.mctavish@duke.edu; Jon Donnelly*, Department of Computer Science, Duke University, Durham, NC 27705, jon.donnelly@duke.edu; Margo Seltzer, Department of Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4, mseltzer@cs.ubc.ca; Cynthia Rudin, Department of Computer Science, Duke University, Durham, NC 27705, cynthia@cs.duke.edu |
| Pseudocode | No | The paper provides mathematical definitions, propositions, and theorems, such as Definition 3.3 and Theorem 3.4, and optimization problems (Equation 3 and 4). However, it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured, code-like steps for its methods. |
| Open Source Code | Yes | The code used for this work is available at https://github.com/jdonnelly36/M-GAM. |
| Open Datasets | Yes | We primarily consider four datasets: the Explainable Machine Learning Challenge dataset (FICO et al., 2018) (referred to as FICO), a breast cancer dataset introduced by Razavi et al. (2018) (referred to as Breast Cancer), the MIMIC-III critical care dataset (Johnson et al., 2016) (referred to as MIMIC), and a dataset concerning the prediction of pharyngitis introduced by Miyagi (2023) (referred to as Pharyngitis). |
| Dataset Splits | Yes | For all experiments that did not report complete sparsity versus accuracy curves, we used 5-fold cross validation to select the value for the ℓ0 sparsity penalty. |
| Hardware Specification | Yes | All experiments that involved timing were conducted using one Tensor TXR231-1000R D126 Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz (512GB RAM 40 cores), except for MIWAE timing experiments, which use one NVIDIA Tesla P100 GPU. |
| Software Dependencies | No | We fit all M-GAMs using FastSparse (Liu et al., 2022)... We fit all non-sparse GAMs using SKLearn's implementation of logistic regression over binned data... Cross validation was performed using 5 folds via GridSearchCV from SKLearn (Pedregosa et al., 2011), and the SKLearn implementation was used for each model class considered other than XGBoost. (A sketch of this cross-validated search follows the table.) |
| Experiment Setup | Yes | For every GAM we fit (M-GAM, FastSparse, and non-ℓ0 GAMs), we created an indicator for each of 8 quantiles (the 0.125 quantile, the 0.25 quantile, and so on). ... We searched over the following set of values for λ for each GAM: 20, 10, 5, 2, 1, 0.5, 0.4, 0.2, 0.1, 0.05, 0.02, 0.01, and 0.005. ... The hyperparameters we considered are: Logistic regression: {'C': [0.01, 0.1, 1, 10], 'penalty': ['l2'], 'max_iter': [10000], 'tol': [5e-2]}... [and other models] (A sketch of the quantile binarization also follows the table.) |
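
The 5-fold cross-validated hyperparameter search quoted in the Software Dependencies and Experiment Setup rows can be illustrated with a minimal sketch. It uses SKLearn's `GridSearchCV` with the logistic-regression grid quoted above; the synthetic data and the AUC scoring choice stand in for the paper's binarized datasets and are assumptions for illustration, not the authors' code (which is at https://github.com/jdonnelly36/M-GAM).

```python
# Minimal sketch of the 5-fold hyperparameter search described above.
# Assumes a binary classification dataset (X, y) is already loaded and
# binned/binarized; this is an illustration, not the authors' code.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Placeholder data standing in for, e.g., the binarized FICO features.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Hyperparameter grid quoted in the Experiment Setup row.
param_grid = {
    "C": [0.01, 0.1, 1, 10],
    "penalty": ["l2"],
    "max_iter": [10000],
    "tol": [5e-2],
}

# 5-fold cross validation, scored by AUC as the paper does for the
# imbalanced Breast Cancer and MIMIC datasets.
search = GridSearchCV(LogisticRegression(), param_grid, cv=5, scoring="roc_auc")
search.fit(X, y)
print(search.best_params_, search.best_score_)
```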
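
The quantile-indicator featurization from the Experiment Setup row can likewise be sketched. The threshold direction (x ≤ quantile), the handling of missing entries and of the top quantile, and the helper name `binarize_with_quantiles` are assumptions for illustration; the exact encoding is in the authors' repository.

```python
# Sketch of the quantile-indicator binarization described above: one
# binary feature per quantile threshold, computed on observed values,
# plus an explicit missingness indicator in the spirit of M-GAM.
import numpy as np
import pandas as pd

def binarize_with_quantiles(x: pd.Series, n_quantiles: int = 8) -> pd.DataFrame:
    """Turn one continuous column into quantile indicators plus a missing flag."""
    qs = np.arange(1, n_quantiles + 1) / n_quantiles  # 0.125, 0.25, ..., 1.0
    thresholds = x.dropna().quantile(qs)
    out = {}
    for q, t in zip(qs, thresholds):
        # NaN comparisons evaluate to False, so the indicator is 0
        # wherever x is missing; missingness is carried separately so
        # the model can reason about it explicitly.
        out[f"{x.name}_le_q{q:g}"] = (x <= t).astype(int)
    out[f"{x.name}_missing"] = x.isna().astype(int)
    return pd.DataFrame(out)

# Example: a toy column with a missing value.
col = pd.Series([1.0, 2.5, np.nan, 4.0, 10.0], name="feature")
print(binarize_with_quantiles(col))
```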