Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Optimal Decision Trees for Nonlinear Metrics
Authors: Emir Demirović, Peter J. Stuckey | Pages: 3733-3741
AAAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The value of our method is given in a dedicated experimental section, where we consider 75 publicly available datasets. Nevertheless, the experiments illustrate that runtimes are reasonable for majority of the tested datasets. |
| Researcher Affiliation | Academia | Delft University of Technology, The Netherlands; Monash University and Data61, Australia |
| Pseudocode | Yes | Pseudo-code for the algorithm is given in Figure 1, where details on bounding the size of the tree in terms of numbers of nodes are elided for simplicity. |
| Open Source Code | Yes | Public release. The code and benchmarks are available at bitbucket.org/EmirD/murtree-bi-objective. |
| Open Datasets | Yes | We considered 75 binary classification datasets used in previous works (Verwer and Zhang 2019; Aglin, Nijssen, and Schaus 2020; Demirovic et al. 2020; Narodytska et al. 2018; Hu, Rudin, and Seltzer 2019). |
| Dataset Splits | Yes | Five-fold cross-validation is used to evaluate each combination of parameters and the parameters that maximises accuracy or F1-score on test set across the folds is selected. |
| Hardware Specification | Yes | The experiments were run one at a time on an Intel i7-3612QM@2.10 GHz with 8 GB RAM. |
| Software Dependencies | No | The paper mentions using the baseline algorithm MurTree (Demirovic et al. 2020) but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We perform hyper-parameter tuning considering parameters depth ∈ {1, 2, 3, 4} and size ∈ {1, 2, ..., 2^depth - 1}. Five-fold cross-validation is used to evaluate each combination of parameters and the parameters that maximises accuracy or F1-score on test set across the folds is selected. The timeout is set to one hour for each benchmark. |
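
The tuning protocol quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the contiguous unshuffled folds, the use of the mean score across folds as the selection criterion, and the `evaluate` callback (standing in for training and scoring a tree) are all assumptions made for illustration.

```python
def parameter_grid(max_depth=4):
    """Enumerate (depth, size) pairs with size in 1..2^depth - 1."""
    return [(d, s) for d in range(1, max_depth + 1) for s in range(1, 2 ** d)]


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous (train, test) index folds.

    Assumption: no shuffling or stratification; the source does not say
    how the folds were formed.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def f1_score(y_true, y_pred):
    """F1-score for binary labels in {0, 1}; returns 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def select_parameters(evaluate, n_samples, max_depth=4, k=5):
    """Return the (depth, size) pair with the best mean score over k folds.

    `evaluate(depth, size, train, test)` is a hypothetical callback that
    trains on `train`, scores on `test`, and returns accuracy or F1.
    """
    folds = k_fold_indices(n_samples, k)

    def mean_score(params):
        depth, size = params
        return sum(evaluate(depth, size, tr, te) for tr, te in folds) / k

    return max(parameter_grid(max_depth), key=mean_score)
```

With depth up to 4, the grid holds 1 + 3 + 7 + 15 = 26 combinations, each evaluated on five folds. In practice, `evaluate` would wrap whichever decision tree learner is being tuned.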