MAPTree: Beating “Optimal” Decision Trees with Bayesian Decision Trees

Authors: Colin Sullivan, Mo Tiwari, Sebastian Thrun

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On 16 real-world datasets, MAPTree either outperforms baselines or demonstrates comparable performance with much smaller trees. On a synthetic dataset, MAPTree also demonstrates greater robustness to noise and better generalization than existing approaches.
Researcher Affiliation | Academia | Department of Computer Science, Stanford University. colins26@stanford.edu, motiwari@stanford.edu, thrun@stanford.edu
Pseudocode | Yes | Algorithm 1: MAPTree. Input: root OR node r, cost function cost, and heuristic function h for AND/OR graph G. Output: solution graph S. (A toy sketch of solving an AND/OR graph appears after this table.)
Open Source Code | Yes | The code for our experiments is available at https://github.com/ThrunGroup/maptree.
Open Datasets | Yes | We evaluate the performance of MAPTree in multiple settings. ... In the third setting, we measure the generalization accuracy, log likelihood, and tree size of models generated by MAPTree and baseline algorithms across all 16 datasets from the CP4IM dataset repository (Guns, Nijssen, and De Raedt 2011).
Dataset Splits | No | No explicit training/validation split is described; the paper reports only stratified 10-fold evaluation for testing, without detailing how folds are partitioned for training and validation. (A sketch of such a protocol appears after this table.)
Hardware Specification | No | No specific hardware details (such as GPU/CPU models or memory) used for the experiments are provided.
Software Dependencies | No | The paper mentions a "heavily optimized C++ implementation that is also callable from Python" but does not provide version numbers for any software dependencies or libraries.
Experiment Setup | Yes | In all experiments in this section, we set α = 0.95 and β = 0.5. (A hedged sketch of the presumed role of α and β appears after this table.)
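The Pseudocode row describes Algorithm 1 as a search over an AND/OR graph: OR nodes choose among candidate splits, an AND node requires all of its subproblems to be solved, and the output is a solution graph. As a toy illustration of that structure only (not the paper's algorithm, which is a best-first search guided by the cost and heuristic functions), here is a minimal sketch that solves a small, hand-built acyclic AND/OR graph by recursion with memoization; every name and cost in it is invented.

```python
from functools import lru_cache

# Toy acyclic AND/OR graph (all names and costs invented for illustration).
# An OR node either terminates as a leaf (with a terminal cost) or picks
# one AND child (a "split"); an AND child requires solving all of its
# OR children. MAPTree's real graph is induced by feature splits on data.
GRAPH = {
    # or_node: (terminal_cost_or_None, {and_child_label: [child_or_nodes]})
    "root": (None, {"split_a": ["L1", "R1"], "split_b": ["L2", "R2"]}),
    "L1": (2.0, {}),
    "R1": (3.5, {}),
    "L2": (1.0, {"split_c": ["L1", "R1"]}),
    "R2": (4.0, {}),
}
SPLIT_COST = 0.5  # flat per-split penalty, standing in for a real cost model


@lru_cache(maxsize=None)
def solve(or_node):
    """Return (value, solution) for `or_node`, where `solution` maps each
    OR node in the chosen subgraph to its selected split (None = leaf)."""
    terminal_cost, splits = GRAPH[or_node]
    best_value, best_solution = float("inf"), None
    if terminal_cost is not None:
        best_value, best_solution = terminal_cost, {or_node: None}
    for label, children in splits.items():
        solved = [solve(child) for child in children]
        value = SPLIT_COST + sum(v for v, _ in solved)
        if value < best_value:
            solution = {or_node: label}
            for _, sub in solved:
                solution.update(sub)
            best_value, best_solution = value, solution
    return best_value, best_solution


value, solution = solve("root")
print(value)     # 5.5
print(solution)  # {'root': 'split_b', 'L2': None, 'R2': None}
```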
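The Dataset Splits row notes that evaluation used stratified 10-fold testing. The paper does not name its tooling, so purely as a hedged sketch of what such a protocol typically looks like, here is a scikit-learn version on a synthetic binary dataset, with an off-the-shelf decision tree standing in for MAPTree:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier  # stand-in model, not MAPTree
from sklearn.metrics import accuracy_score

# Synthetic binary-feature data for illustration only.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 20))
y = rng.integers(0, 2, size=500)

# Stratified 10-fold: each fold preserves the overall class proportions.
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in skf.split(X, y):
    model = DecisionTreeClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))

print(f"mean 10-fold accuracy: {np.mean(scores):.3f}")
```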
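The Experiment Setup row fixes α = 0.95 and β = 0.5. MAPTree searches for the maximum a posteriori tree under a Bayesian (BCART-style) posterior; assuming these hyperparameters play the same role as in the classic BCART depth prior p_split(d) = α(1 + d)^(−β) (an assumption on my part; check the paper for the exact prior), a small sketch shows how the prior increasingly discourages splits with depth:

```python
# Hedged sketch: a BCART-style depth-dependent split prior, assuming
# MAPTree's alpha/beta play the same role as in Chipman et al.'s
# p_split(d) = alpha * (1 + d) ** (-beta). This is an assumption, not a
# statement of the paper's exact prior.

ALPHA, BETA = 0.95, 0.5  # values used in the paper's experiments


def p_split(depth: int, alpha: float = ALPHA, beta: float = BETA) -> float:
    """Prior probability that a node at `depth` splits rather than leafs out."""
    return alpha * (1 + depth) ** (-beta)


for d in range(5):
    print(f"depth {d}: p_split = {p_split(d):.3f}")
```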