Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Optimal Decision Tree Policies for Markov Decision Processes
Authors: Daniël Vos, Sicco Verwer
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experiments comparing the performance of OMDTs with VIPER and dtcontrol. ... All of our experiments ran on a Linux machine with 16 Intel Xeon CPU cores and 72 GB of RAM total and used Gurobi 10.0.0 with default parameters. Each method ran on a single CPU core. |
| Researcher Affiliation | Academia | Daniël Vos, Sicco Verwer Delft University of Technology EMAIL |
| Pseudocode | No | The paper describes the OMDT formulation using mathematical equations (1-8) and natural language, but it does not contain a structured pseudocode or algorithm block. |
| Open Source Code | Yes | The full code for OMDT and our experiments can be found on GitHub. (Footnote 3: https://github.com/tudelft-cda-lab/OMDT) |
| Open Datasets | Yes | For comparison we implemented 13 environments based on well-known MDPs from the literature, the sizes of these MDPs are given in Table 2. |
| Dataset Splits | No | The paper does not provide explicit training/test/validation dataset splits. Reinforcement learning often involves policy learning within an environment rather than static data splits. |
| Hardware Specification | Yes | All of our experiments ran on a Linux machine with 16 Intel Xeon CPU cores and 72 GB of RAM total and used Gurobi 10.0.0 with default parameters. Each method ran on a single CPU core. |
| Software Dependencies | Yes | All of our experiments ran on a Linux machine with 16 Intel Xeon CPU cores and 72 GB of RAM total and used Gurobi 10.0.0 with default parameters. |
| Experiment Setup | Yes | We consider an OMDT optimal when the relative gap between its objective and bound is proven to be less than 0.01%. We solved OMDTs for a depth of 3 for a maximum of 2 hours and display the results in Table 2. ... All runs were limited to 2 hours. |
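
The optimality criterion quoted under Experiment Setup (a relative gap between objective and bound below 0.01%, within a 2-hour limit) can be illustrated with a small sketch. This is not taken from the OMDT code: the function names `relative_gap` and `is_proven_optimal` are hypothetical, and the gap definition `|bound - objective| / |objective|` is assumed here because it is the convention commonly used by MIP solvers.

```python
import math

# Thresholds quoted in the Experiment Setup row.
GAP_TOLERANCE = 0.01 / 100      # 0.01% expressed as a fraction (1e-4)
TIME_LIMIT_SECONDS = 2 * 60 * 60  # 2 hours

def relative_gap(objective: float, bound: float) -> float:
    """Relative gap between the incumbent objective and the best bound.

    Assumed definition: |bound - objective| / |objective|.
    When the objective is zero, the gap is 0 if the bound is also zero
    and infinite otherwise.
    """
    if objective == 0.0:
        return 0.0 if bound == 0.0 else math.inf
    return abs(bound - objective) / abs(objective)

def is_proven_optimal(objective: float, bound: float,
                      tolerance: float = GAP_TOLERANCE) -> bool:
    """True when the relative gap is strictly below the tolerance."""
    return relative_gap(objective, bound) < tolerance

if __name__ == "__main__":
    # Hypothetical values, chosen only to show both outcomes.
    print(is_proven_optimal(100.0, 100.005))  # gap of about 0.005%
    print(is_proven_optimal(100.0, 100.02))   # gap of about 0.02%
```

Under this reading, a run that reaches the time limit with a gap at or above the tolerance would be reported as not proven optimal, even if its policy happens to be the best one found.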