Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Improving Decision Trees through the Lens of Parameterized Local Search

Authors: Juha Harviainen, Frank Sommer, Manuel Sorge

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To complement our theoretical findings, we study the potential benefits of local search, that is, is there room for improvement in the decision trees constructed by the heuristics? While it is well known that optimally learning decision trees is NP-hard [16], the heuristics might still perform (almost) optimally on practical instances or at least be optimal in some local neighborhood of models. For instance, recent research [14] has perhaps unexpectedly suggested that the pruning heuristics employed by the commonly used libraries tend to be near-optimal on common benchmark datasets. We observe a similar phenomenon here suggesting near-optimality also locally, that is, while there is some room for improvement on the instances already by performing a single local search operation, the decrement in errors tends to be mild (see Table 1). ... Section 6, we summarize our experiments and their results.
Researcher Affiliation	Academia	Juha Harviainen Department of Computer Science University of Helsinki Helsinki, Finland EMAIL Frank Sommer Institute of Computer Science Friedrich-Schiller Universit at Jena Jena, Germany EMAIL Manuel Sorge Institute of Logic and Computation TU Wien Vienna, Austria EMAIL
Pseudocode	Yes	Algorithm 1 Binary search to populate the unknown thresholds of the decision tree T. The initial call is Compute Thresholds(E, r), where E is the set of examples and r is the root of T. Global Variables: Decision tree T, N V (T) of nodes with unknown threshold, error vector t Function Compute Thresholds (E , n) Input: An example set E E and a node n T. Output: True if and only if we can assign thresholds to all nodes with unknown threshold in the decision tree corresponding to the subtree rooted at n while respecting the error vector t if we only consider the examples E . 1 if n is a leaf then // check if error bound according to t is met 2 Let c be the class label of n and tn be the entry of t corresponding to n 3 if more than tn examples of E do not have label c then return False else return True 4 if n N then z = Binary Search(E , n) // n has an unknown threshold 5 else z = thr(n) 6 p = right child of n, x = Compute Thresholds(E [ffeat(n) > z], p) 7 q = left child of n, y = Compute Thresholds(E [ffeat(n) z], q) 8 return x AND y Function Binary Search (E , n) Input: An example set E E and a node n T. Output: The largest threshold x of feat(n) s.t. we can populate all unknown thresholds in the left subtree of n while respecting the error vector t if we only consider the examples E . 9 Set D to be an array containing all thresholds of feature feat(n) in ascending order 10 Set L = 0, R = \|length(D)\| 1, b = 0, q = left child of n 11 while L R do 12 m = (L + R)/2 13 if Compute Thresholds(E [ffeat(n) D[m]], q) = True then L = m + 1, b = 1 14 else R = m 1, b = 0 15 if b = 1 then return D[m] 16 return D[m 1], with D[ 1] = D[0] 1
Open Source Code	Yes	3A continuously updated version of our paper is available on ar Xiv [12] and the related source code for replicating the experiments on Zenodo [13]. ... Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: The source code, compilation instructions, and used datasets are available in the supplemental material. The unmodified original datasets are publicly available from third parties. If the paper gets accepted, the source code, compilation instructions, and used datasets will be made available in a public repository.
Open Datasets	Yes	We used 47 datasets from the Penn Machine Learning Benchmarks library [38], including 32 previously used for minimum-size tree computation [1, 32, 42] and additional larger datasets. ... Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: The source code, compilation instructions, and used datasets are available in the supplemental material. The unmodified original datasets are publicly available from third parties.
Dataset Splits	No	The paper does not explicitly provide training/test/validation dataset splits. While it mentions preprocessing steps like transforming categorical features, converting to binary classification, and removing examples with identical values, it does not specify how the datasets were partitioned into training, testing, or validation sets for the experiments conducted. The evaluation focuses on existing heuristically computed trees and checking if their misclassifications could be reduced.
Hardware Specification	Yes	We implemented the algorithm in Python 3.10.12 and all experiments were carried out on a personal computer running Ubuntu Linux 22.04 with an 8-core Intel Core i7-9700 processor and 16Gi B RAM (running up to eight instances of the algorithm in parallel).
Software Dependencies	Yes	We implemented the algorithm in Python 3.10.12 and all experiments were carried out on a personal computer running Ubuntu Linux 22.04 with an 8-core Intel Core i7-9700 processor and 16Gi B RAM (running up to eight instances of the algorithm in parallel). ... We computed unpruned and pruned trees using WEKA 3.8.5 s [9] C4.5 implementation [36] J48.
Experiment Setup	Yes	The unpruned trees were obtained by running WEKA s J48 classifier with the flags -no-cv -B -M 0 -J -U. The pruned trees were computed by the replacement heuristic implemented in J48 when run with the flags -no-cv -B -M 0 -J -S. ... We first checked whether the heuristically computed real-world pruned decision trees are locally optimal. That is, for each of the 47 datasets we took the pruned tree and checked whether their misclassifications could be reduced by up to two adjustment or exchange operations.