Feature Learning for Interpretable, Performant Decision Trees
Authors: Jack Good, Torin Kovach, Kyle Miller, Artur Dubrawski
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This section contains evaluation and demonstration of interpretable models. Comprehensive results, as well as additional experiment details, are in the supplementary material. Unless otherwise noted, all results are from crisp trees. ... We compare various configurations of our algorithm against popular tree-based baselines including decision trees, random forests, and Extra Trees. We report 10-fold cross validation accuracy and average number of splits in the model. |
| Researcher Affiliation | Academia | Jack H. Good, Torin Kovach, Kyle Miller, Artur Dubrawski Carnegie Mellon University {jhgood,tkovach,mille856,awd}@andrew.cmu.edu |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. Methods are described textually and through mathematical formulations. |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | The data sets are selected from among the most viewed tabular classification data sets on the UCI machine learning repository [14] at the time of writing. ... Results for MNIST trees. ... Table 1: Results of tabular data benchmarks. Number of attributes p is listed before and after one-hot encoding categorical attributes. ... iris [18], heart-disease [30], dry-bean [31], wine [1], car [5], wdbc [44], sonar [38], pendigits [2], ionosphere [39] |
| Dataset Splits | Yes | We report 10-fold cross validation accuracy and average number of splits in the model. ... Our models and the conventional decision trees have cost-complexity pruning α selected by cross-validation. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) used for running experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions various algorithms and tools (e.g., CART, Random Forests, XGBoost, scikit-learn) but does not provide specific version numbers for any software dependencies needed to replicate the experiment. |
| Experiment Setup | Yes | Categorical attributes are one-hot encoded, and the data is normalized to mean 0 and standard deviation 1. For our models, we show in the main paper results for linear features and distance-to-prototype features with diagonal inverse covariance. Each is regularized with L1 coefficient λ1 = .01 to promote sparsity. Our models and the conventional decision trees have cost-complexity pruning α selected by cross-validation. Other hyperparameters are fixed and described in the supplementary material. (A minimal sketch of this protocol appears after the table.) |
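
The Experiment Setup and Dataset Splits rows describe a standard tabular benchmark protocol: one-hot encode categorical attributes, standardize the data, select the cost-complexity pruning α by cross-validation, and report 10-fold cross-validation accuracy against tree-based baselines. The sketch below is a hedged reconstruction of that protocol using scikit-learn, not code released by the authors: the `benchmark` helper, the placeholder inputs `X`, `y` and column lists, the α grid, and the forest sizes are all assumptions, and the paper's feature-learning trees themselves are not reproduced here.

```python
# Hedged sketch of the baseline protocol described in the table above.
# Assumptions: placeholder data/column names, an illustrative ccp_alpha grid,
# default forest sizes, and scaling only the numeric columns (the paper
# standardizes the data after one-hot encoding).
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier


def benchmark(X, y, categorical_cols, numeric_cols):
    """Return mean 10-fold CV accuracy for each tree baseline (hypothetical helper)."""
    preprocess = ColumnTransformer([
        ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
        ("scale", StandardScaler(), numeric_cols),
    ])

    # Cost-complexity pruning alpha chosen by an inner cross-validation;
    # the grid is illustrative and not taken from the paper.
    pruned_tree = GridSearchCV(
        DecisionTreeClassifier(),
        param_grid={"ccp_alpha": np.logspace(-4, -1, 10)},
        cv=5,
    )

    baselines = {
        "decision_tree": pruned_tree,
        "random_forest": RandomForestClassifier(n_estimators=100),
        "extra_trees": ExtraTreesClassifier(n_estimators=100),
    }

    scores = {}
    for name, model in baselines.items():
        pipeline = Pipeline([("prep", preprocess), ("model", model)])
        # 10-fold cross-validation accuracy, as reported in the paper's benchmarks.
        scores[name] = cross_val_score(pipeline, X, y, cv=10).mean()
    return scores
```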