Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

SORTeD Rashomon Sets of Sparse Decision Trees: Anytime Enumeration

Authors: Elif Arslan, Jacobus van der Linden, Serge Hoogendoorn, Marco Rinaldi, Emir Demirović

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct a series of experiments with the following aims: (1) to assess SORTD's runtime efficiency in computing Rashomon sets; (2) to showcase that a small number of high-quality trees easily found by SORTD may be informative for model evaluation via variable importance analysis; and (3) to demonstrate SORTD's flexibility in enumerating and analysing Rashomon sets under varying objective functions.
Researcher Affiliation | Academia | Elif Arslan, Delft University of Technology, Netherlands; Jacobus G. M. van der Linden, Delft University of Technology, Netherlands; Serge Hoogendoorn, Delft University of Technology, Netherlands; Marco Rinaldi, Delft University of Technology, Netherlands; Emir Demirović, Delft University of Technology, Netherlands
Pseudocode | Yes | Alg. 1 shows how a search node computes its next best solution. [...] Algorithm 1 Get Next Solution() [...] Algorithm 2 Explore Candidates(...) [...] Alg. 3 adapts the depth-two subroutine [...] Algorithm 3 Calculate Three Node Sols(...) [...] Alg. 4 summarizes how the Rashomon set is computed [...] Algorithm 4 Main(...) [...] Alg. 5 outlines the procedure for generating all trees with a single branching node. [...] Algorithm 5 Calculate One Node Sol(...) [...] Alg. 6 handles the case in which the right child is a leaf [...] Algorithm 6 Calculate Two Node Sols(...)
Open Source Code | Yes | We implemented SORTD in C++ and provide it as a python package (footnote 1: https://github.com/ConSol-Lab/pysortd).
Open Datasets | Yes | For aims (1) and (2), we use the 30 benchmark binary classification datasets previously used to assess state-of-the-art methods [10, 15, 16, 30, 46]. [...] The original datasets can be obtained from the UCI Machine Learning repository [55] and from [51, 52, 56, 57]. For aim (3) we adopt common regression [47] and fairness benchmark datasets [48].
Dataset Splits | Yes | Each dataset was bootstrapped 20 times. [...] We run SORTD using the regularized accuracy objective, so each leaf node is penalized with the sparsity penalty λ. [...] CART and STreeD are run repeatedly within that time budget on random samples of 50% of the total dataset.
Hardware Specification | Yes | All experiments are run single-threaded on an Intel Xeon E5-6448Y @ 2.1 GHz with 100 GB RAM [49], with a 300 seconds time limit.
Software Dependencies | No | We implemented SORTD in C++ and provide it as a python package. (Explanation: Although programming languages are mentioned, no specific version numbers for Python, C++, or any specific libraries/solvers are provided.)
Experiment Setup | Yes | We varied the depth budget d ∈ {3, 4, 5} and the complexity cost λ ∈ {0.001, 0.01, 0.1}. [...] with a 300 seconds time limit. [...] We use λ = 0.01 and a depth budget of four. [...] for max-depth four, λ = 0.001, and ε = 0.1. [...] with max-depth d = 3, discrimination limit δ = 1%, and sparsity penalty λ = 0.01 (for SORTD).
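
The Dataset Splits and Experiment Setup entries can be read as a small experimental protocol. The following Python sketch illustrates one possible reading of it and is not the authors' code: it does not call the pysortd package, and all function names are hypothetical. It assumes that each bootstrap resample has the same size as the original dataset, that the regularized objective is the misclassification rate plus λ per leaf, and that Rashomon set membership is a multiplicative (1 + ε) bound on the best objective. The quoted excerpts state none of these three details explicitly, so the paper may define them differently.

```python
import itertools
import random

# Values quoted in the Experiment Setup and Dataset Splits entries.
DEPTH_BUDGETS = (3, 4, 5)
SPARSITY_PENALTIES = (0.001, 0.01, 0.1)
N_BOOTSTRAPS = 20
TIME_LIMIT_SECONDS = 300


def experiment_grid():
    """Return every (depth budget, lambda) pair from the reported grid."""
    return list(itertools.product(DEPTH_BUDGETS, SPARSITY_PENALTIES))


def bootstrap_indices(n_rows, n_rounds=N_BOOTSTRAPS, seed=0):
    """Draw n_rounds resamples of row indices, with replacement.

    Assumption: each resample has as many rows as the original dataset.
    """
    rng = random.Random(seed)
    return [
        [rng.randrange(n_rows) for _ in range(n_rows)]
        for _ in range(n_rounds)
    ]


def regularized_objective(n_misclassified, n_rows, n_leaves, lam):
    """Assumed form: misclassification rate plus a penalty of lam per leaf."""
    return n_misclassified / n_rows + lam * n_leaves


def in_rashomon_set(objective, best_objective, epsilon):
    """Assumed membership test: objective within (1 + epsilon) of the best."""
    return objective <= (1 + epsilon) * best_objective
```

Under these assumptions, `experiment_grid()` contains nine configurations, and each would be run once per bootstrap resample within the 300-second time limit.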