Beyond Sparsity: Tree Regularization of Deep Models for Interpretability
Authors: Mike Wu, Michael Hughes, Sonali Parbhoo, Maurizio Zazzi, Volker Roth, Finale Doshi-Velez
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using intuitive toy examples as well as medical tasks for treating sepsis and HIV, we demonstrate that this new tree regularization yields models that are easier for humans to simulate than simpler L1 or L2 penalties without sacrificing predictive power. |
| Researcher Affiliation | Academia | (1) Stanford University, wumike@cs.stanford.edu; (2) Harvard University SEAS, mike@michaelchughes.com, finale@seas.harvard.edu; (3) University of Basel, {sonali.parbhoo,volker.roth}@unibas.ch; (4) University of Siena, maurizio.zazzi@unisi.it |
| Pseudocode | Yes | Algorithm 1 Average-Path-Length Cost Function. Require: ŷ(·, W): binary prediction function with parameters W; D = {x_n}, n = 1…N: reference dataset with N examples. 1: function Ω(W) 2: tree ← TRAINTREE({x_n, ŷ(x_n, W)}) 3: return (1/N) Σ_n PATHLENGTH(tree, x_n). (A runnable sketch appears below the table.) |
| Open Source Code | Yes | We have released an open-source Python toolbox to allow others to experiment with tree regularization: http://github.com/dtak/tree-regularization-public |
| Open Datasets | Yes | We study time-series data for 11,786 septic ICU patients from the public MIMIC III dataset (Johnson et al. 2016). We use the EuResist Integrated Database (Zazzi et al. 2012) for 53,236 patients diagnosed with HIV. We have recordings of 630 speakers... (Garofolo and others 1993). |
| Dataset Splits | Yes | Sepsis Critical Care: ...7,070 patients are used in training, 1,769 for validation, and 294 for test. HIV Therapy Outcome (HIV): ...37,618 patients are used for training; 7,986 for testing, and 7,632 for validation. Phonetic Speech (TIMIT): ...6,303 sequences, split into 3,697 for training, 925 for validation, and 1,681 for testing. |
| Hardware Specification | No | The paper mentions that computations were supported by "the FAS Research Computing Group at Harvard and sciCORE (http://scicore.unibas.ch/) scientific computing core facility at University of Basel" but does not provide specific hardware details such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software like "Python's scikit-learn" and "Autograd" but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | The objective in equation 1 was optimized via Adam gradient descent (Kingma and Ba 2014) using a batch size of 100 and a learning rate of 1e-3 for 250 epochs, and hyperparameters were set via cross validation using grid search (see supplement for full experimental details). Optimization of our surrogate objective is done via gradient descent. We use Autograd to compute gradients of the loss in Eq. (5) with respect to ξ, then use Adam to compute descent directions with step sizes set to 0.01 for toy datasets and 0.001 for real-world datasets. (A sketch of the surrogate fitting step also appears below the table.) |
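
The Algorithm 1 pseudocode quoted above maps directly onto standard decision-tree tooling. Below is a minimal Python sketch of the average-path-length cost Ω(W), assuming scikit-learn's `DecisionTreeClassifier` as the TRAINTREE step; the function name `average_path_length`, the thresholding at 0.5, and the `min_samples_leaf` choice are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of Algorithm 1: train a tree to mimic the deep model's
# predictions on a reference dataset, then average the decision-path lengths.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def average_path_length(predict_fn, W, X, min_samples_leaf=10):
    """Omega(W): mean root-to-leaf path length of a tree fit to
    {(x_n, y_hat(x_n, W))} over the reference examples X."""
    y_hat = (predict_fn(X, W) > 0.5).astype(int)        # binarize the deep model's predictions
    tree = DecisionTreeClassifier(min_samples_leaf=min_samples_leaf)
    tree.fit(X, y_hat)                                   # TRAINTREE step
    # decision_path returns, per example, an indicator over the tree nodes it
    # visits; the row sum is that example's path length (in nodes).
    node_indicator = tree.decision_path(X)
    path_lengths = np.asarray(node_indicator.sum(axis=1)).ravel()
    return float(path_lengths.mean())
```

A deeper mimic tree means longer average paths and hence a larger value of Ω(W), which is exactly what the tree regularizer penalizes.

The experiment-setup quote also describes fitting a differentiable surrogate for Ω(W) with Autograd and Adam. The sketch below shows that fitting step under stated assumptions: the one-hidden-layer MLP architecture, the dummy training pairs (W_j, Ω(W_j)), and all variable names are illustrative; only the squared-error fit and the Adam step sizes (0.01 toy, 0.001 real-world) come from the paper.

```python
# Minimal sketch of fitting the surrogate Omega_hat(W; xi) by gradient descent,
# using Autograd for gradients and its Adam optimizer, per the setup quote.
import numpy as np
import autograd.numpy as anp
from autograd import grad
from autograd.misc.optimizers import adam

rng = np.random.RandomState(0)
D, J, H = 50, 20, 25                      # weight dim, observed W's, hidden units (assumed)
W_samples = rng.randn(J, D)               # flattened deep-model weight vectors (dummy data)
omega_true = rng.rand(J) * 10.0           # their true path lengths from Algorithm 1 (dummy data)

xi_init = [0.1 * rng.randn(D, H), np.zeros(H),   # hidden-layer weights and bias
           0.1 * rng.randn(H), 0.0]              # output-layer weights and bias

def surrogate_predict(xi, W_batch):
    """One-hidden-layer MLP Omega_hat(W; xi), applied row-wise (architecture assumed)."""
    A, b, c, d = xi
    h = anp.tanh(anp.dot(W_batch, A) + b)
    return anp.dot(h, c) + d

def surrogate_loss(xi, step):
    preds = surrogate_predict(xi, W_samples)
    return anp.mean((preds - omega_true) ** 2)   # squared-error fit to the true costs

# Step size 1e-3 matches the paper's real-world setting (1e-2 for toy data).
xi_opt = adam(grad(surrogate_loss), xi_init, step_size=1e-3, num_iters=250)
```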
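Because the surrogate is an ordinary feed-forward network in ξ, its gradients with respect to the deep model's weights are available by the chain rule, which is what lets the tree-regularization penalty be used inside standard gradient-based training.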