The Influence of Dimensions on the Complexity of Computing Decision Trees
Authors: Stephen G. Kobourov, Maarten Löffler, Fabrizio Montecchiani, Marcin Pilipczuk, Ignaz Rutter, Raimund Seidel, Manuel Sorge, Jules Wulms
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study this problem with respect to the number d of dimensions of the feature space Rd, which contains n training examples. We show that it can be solved in O(n2d+1) time, but under reasonable complexity-theoretic assumptions it is not possible to achieve f(d) no(d/ log d) running time. The problem is solvable in (d R)O(d R) n1+o(1) time, if there are exactly two classes and R is an upper bound on the number of tree leaves labeled with the first class. |
| Researcher Affiliation | Academia | 1 University of Arizona, Department of Computer Science 2 Utrecht University, Department of Information and Computing Sciences 3 University of Perugia, Department of Engineering 4 University of Warsaw, Faculty of Mathematics, Informatics, and Mechanics 5 University of Passau, Faculty of Computer Science and Mathematics 6 Saarland University, Department of Computer Science 7 TU Wien, Institute of Logic and Computation |
| Pseudocode | Yes | Algorithm 1: SMALLESTDECISIONTREE(E, d, s); Algorithm 2: BINARYSEARCH(E, i, j) |
| Open Source Code | No | The paper does not contain an unambiguous statement about releasing code for the work described, nor does it provide a direct link to a code repository or mention code in supplementary materials. |
| Open Datasets | No | The paper is theoretical and defines 'E' as a set of examples, but it does not use specific, publicly available datasets for empirical training or provide access information for any dataset. |
| Dataset Splits | No | The paper is theoretical and does not discuss empirical experiments. Therefore, it does not provide specific dataset split information (e.g., training, validation, test splits) needed for data partitioning and reproduction of experiments. |
| Hardware Specification | No | The paper is theoretical and focuses on algorithms and complexity analysis, not empirical experiments. Therefore, it does not describe any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and describes algorithms and complexity analysis. It does not mention any specific software dependencies or version numbers required to replicate its findings. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithm design and complexity. It does not describe any empirical experimental setup details, hyperparameters, or training configurations. |