Optimal Decision Trees for Nonlinear Metrics

Authors: Emir Demirović, Peter J. Stuckey
Pages: 3733-3741

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The value of our method is given in a dedicated experimental section, where we consider 75 publicly available datasets. Nevertheless, the experiments illustrate that runtimes are reasonable for the majority of the tested datasets.
Researcher Affiliation | Academia | Delft University of Technology, The Netherlands; Monash University and Data61, Australia
Pseudocode | Yes | Pseudo-code for the algorithm is given in Figure 1, where details on bounding the size of the tree in terms of the number of nodes are elided for simplicity.
Open Source Code | Yes | Public release. The code and benchmarks are available at bitbucket.org/EmirD/murtree-bi-objective.
Open Datasets | Yes | We considered 75 binary classification datasets used in previous works (Verwer and Zhang 2019; Aglin, Nijssen, and Schaus 2020; Demirovic et al. 2020; Narodytska et al. 2018; Hu, Rudin, and Seltzer 2019).
Dataset Splits | Yes | Five-fold cross-validation is used to evaluate each combination of parameters, and the parameters that maximise accuracy or F1-score on the test set across the folds are selected.
Hardware Specification | Yes | The experiments were run one at a time on an Intel i7-3612QM @ 2.10 GHz with 8 GB RAM.
Software Dependencies | No | The paper mentions using the baseline algorithm MurTree (Demirovic et al. 2020) but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | We perform hyper-parameter tuning considering parameters depth ∈ {1, 2, 3, 4} and size ∈ {1, 2, ..., 2^depth - 1}. Five-fold cross-validation is used to evaluate each combination of parameters, and the parameters that maximise accuracy or F1-score on the test set across the folds are selected. The timeout is set to one hour for each benchmark.
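To make the reported tuning protocol concrete, below is a minimal sketch (Python) of the described grid search over depth and size with five-fold cross-validation, selecting the combination that maximises mean accuracy or F1-score across folds. The `train_tree` callable is a hypothetical placeholder for the authors' optimal-decision-tree solver and is not part of their released code; the one-hour per-benchmark timeout is not modelled here.

```python
# Sketch of the paper's tuning protocol (not the authors' implementation).
# Assumes `train_tree(X, y, depth=..., size=...)` returns a fitted classifier
# with a .predict(X) method; this callable is a hypothetical stand-in.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import accuracy_score, f1_score

def tune(X, y, train_tree, metric="accuracy", folds=5, seed=0):
    """Grid-search depth and size with k-fold CV; return the best (depth, size)."""
    score_fn = accuracy_score if metric == "accuracy" else f1_score
    kf = KFold(n_splits=folds, shuffle=True, random_state=seed)
    best_params, best_score = None, -np.inf
    for depth in (1, 2, 3, 4):
        # size ranges over {1, ..., 2^depth - 1} feature (decision) nodes
        for size in range(1, 2 ** depth):
            fold_scores = []
            for train_idx, test_idx in kf.split(X):
                tree = train_tree(X[train_idx], y[train_idx], depth=depth, size=size)
                fold_scores.append(score_fn(y[test_idx], tree.predict(X[test_idx])))
            mean_score = float(np.mean(fold_scores))
            if mean_score > best_score:
                best_params, best_score = (depth, size), mean_score
    return best_params, best_score
```

Passing `metric="f1"` reproduces the F1-score variant of the selection; swapping in the released MurTree-based solver for `train_tree` would mirror the setup quoted above.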