Efficient Inference of Optimal Decision Trees
Authors: Florent Avellaneda3195-3202
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we propose a novel approach for inferring an optimal decision tree with a minimum depth based on the incremental generation of Boolean formulas. The experimental results indicate that it scales sufficiently well and the time it takes to run grows slowly with the size of datasets. |
| Researcher Affiliation | Academia | Florent Avellaneda Computer Research Institute of Montreal florent.avellaneda@crim.ca |
| Pseudocode | Yes | Algorithm 1 (Generate Feature Constraints) [...] Algorithm 2 (Generate Class Constraints) [...] Algorithm 3 (Infer Decision Tree) |
| Open Source Code | Yes | Our two algorithms were implemented1 in C++ calling the SAT solver Mini SAT (E en and S orensson 2003) and we ran experiments on Ubuntu with Intel Core CPU i7-2600K @ 3.40GHz and we limit the memory usage to 2 GB. We compared our algorithms with the one of Bessiere, Hebrard, and O Sullivan (2009), denoted DT2, and the one of Narodytska et al. (2018), denoted DT1. We refer to the results they have published. Note that the characteristics of the machines and tools they used are similar to ours or even better for DT1 which uses Intel Xeon 3.50GHz processor with a 2009 SAT solver. Note that additional benchmarks are available at https://github.com/Florent Avellaneda/ Infer DT. 1See https://github.com/Florent Avellaneda/Infer DT |
| Open Datasets | Yes | The datasets we used are extracted from the paper of Verwer and Zhang (2019) and are available at https://github.com/Sicco Verwer/binoct. |
| Dataset Splits | Yes | The last column corresponds the 10-fold cross-validations. [...] Each dataset corresponds to a 5-fold cross-validation. |
| Hardware Specification | Yes | Our two algorithms were implemented1 in C++ calling the SAT solver Mini SAT (E en and S orensson 2003) and we ran experiments on Ubuntu with Intel Core CPU i7-2600K @ 3.40GHz and we limit the memory usage to 2 GB. |
| Software Dependencies | No | The paper states 'calling the SAT solver Mini SAT (E en and S orensson 2003)', which refers to a specific publication of the solver but does not provide a specific version number for Mini SAT. |
| Experiment Setup | Yes | The value of k is initially 1, and while the algorithm is not finding a solution the value k is increased. [...] Then it performs a dichotomic search on the number of nodes allowed between 1 and 2k+1 1 to find a decision tree with a minimal number of nodes. [...] we limit the memory usage to 2 GB. |