Feature Learning for Interpretable, Performant Decision Trees

Authors: Jack Good, Torin Kovach, Kyle Miller, Artur Dubrawski

NeurIPS 2023

Reproducibility assessment (Variable, Result, LLM Response):
Research Type: Experimental
LLM Response: "This section contains evaluation and demonstration of interpretable models. Comprehensive results, as well as additional experiment details, are in the supplementary material. Unless otherwise noted, all results are from crisp trees. ... 4.1 Benchmarks: We compare various configurations of our algorithm against popular tree-based baselines including decision trees, random forests, and Extra Trees. We report 10-fold cross validation accuracy and average number of splits in the model."

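As a rough illustration of the benchmark protocol quoted above, the sketch below runs the same baseline comparison (decision tree, random forest, Extra Trees) with 10-fold cross-validation in scikit-learn. The iris data set and the default hyperparameters are placeholder assumptions; the paper's own feature-learning trees are not public, so only the baselines appear here.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# One of the benchmark data sets listed under Open Datasets below.
X, y = load_iris(return_X_y=True)

baselines = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
    "extra trees": ExtraTreesClassifier(random_state=0),
}

# 10-fold cross-validation accuracy, as reported in the paper.
for name, model in baselines.items():
    scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} (+/- {scores.std():.3f})")

# The "average number of splits" for a single tree can be read off the
# fitted structure: internal nodes = total nodes minus leaves.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print("splits:", tree.tree_.node_count - tree.get_n_leaves())
```
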
Researcher Affiliation: Academia
LLM Response: "Jack H. Good, Torin Kovach, Kyle Miller, Artur Dubrawski, Carnegie Mellon University, {jhgood,tkovach,mille856,awd}@andrew.cmu.edu"

Pseudocode: No
LLM Response: The paper does not contain any clearly labeled pseudocode or algorithm blocks. Methods are described textually and through mathematical formulations.

Open Source Code: No
LLM Response: The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available.

Open Datasets: Yes
LLM Response: "The data sets are selected from among the most viewed tabular classification data sets on the UCI machine learning repository [14] at the time of writing. ... Results for MNIST trees. ... Table 1: Results of tabular data benchmarks. Number of attributes p is listed before and after one-hot encoding categorical attributes. ... iris [18], heart-disease [30], dry-bean [31], wine [1], car [5], wdbc [44], sonar [38], pendigits [2], ionosphere [39]"

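For reference, three of the listed data sets ship with scikit-learn, which allows a quick sanity check without fetching anything from UCI; the mapping below (e.g. wdbc to load_breast_cancer) is an assumption about which built-in loaders correspond to the cited data sets.

```python
from sklearn.datasets import load_breast_cancer, load_iris, load_wine

# iris [18], wine [1], and wdbc [44] have scikit-learn loaders; the remaining
# data sets (car, sonar, pendigits, ...) must be fetched from UCI directly.
datasets = {
    "iris": load_iris(return_X_y=True),
    "wine": load_wine(return_X_y=True),
    "wdbc": load_breast_cancer(return_X_y=True),
}
for name, (X, y) in datasets.items():
    print(f"{name}: {X.shape[0]} rows, {X.shape[1]} attributes, {len(set(y))} classes")
```
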
Dataset Splits: Yes
LLM Response: "We report 10-fold cross validation accuracy and average number of splits in the model. ... Our models and the conventional decision trees have cost-complexity pruning α selected by cross-validation."

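Selecting the pruning strength α by cross-validation is standard cost-complexity pruning; a minimal sketch with scikit-learn's CART implementation follows. The data set, the 10-fold grid search, and the construction of the α grid are assumptions, since the paper does not spell out these details.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in: wdbc from the list above

# Candidate alphas come from the tree's own cost-complexity pruning path.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
alphas = np.unique(np.clip(path.ccp_alphas, 0, None))

# Select the pruning strength alpha by cross-validation.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"ccp_alpha": alphas},
    cv=10,
    scoring="accuracy",
)
search.fit(X, y)
print("selected alpha:", search.best_params_["ccp_alpha"])
```
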
Hardware Specification: No
LLM Response: No specific hardware details (e.g., GPU models, CPU types, memory) used for running experiments are provided in the paper.

Software Dependencies: No
LLM Response: The paper mentions various algorithms and tools (e.g., CART, Random Forests, XGBoost, scikit-learn) but does not provide specific version numbers for any software dependencies needed to replicate the experiment.

Experiment Setup: Yes
LLM Response: "Categorical attributes are one-hot encoded, and the data is normalized to mean 0 and standard deviation 1. For our models, we show in the main paper results for linear features and distance-to-prototype features with diagonal inverse covariance. Each is regularized with L1 coefficient λ1 = .01 to promote sparsity. Our models and the conventional decision trees have cost-complexity pruning α selected by cross-validation. Other hyperparameters are fixed and described in the supplementary material."

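The preprocessing half of this setup is reproducible directly; a minimal sketch, assuming index-identified categorical columns, is below. The column indices are hypothetical, and the L1-regularized feature learning (λ1 = .01) applies to the paper's own split functions, which are not released, so a plain CART tree stands in at the end of the pipeline.

```python
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Hypothetical column indices; which columns are categorical varies per data set.
categorical_cols = [0, 3]
numeric_cols = [1, 2, 4]

preprocess = ColumnTransformer([
    # One-hot encode categorical attributes.
    ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    # Normalize numeric attributes to mean 0 and standard deviation 1.
    ("scale", StandardScaler(), numeric_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    # ccp_alpha would be chosen by cross-validation (see Dataset Splits above).
    ("tree", DecisionTreeClassifier(random_state=0)),
])
```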