Optimal Decision Tree Policies for Markov Decision Processes
Authors: Daniël Vos, Sicco Verwer
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experiments comparing the performance of OMDTs with VIPER and dtcontrol. ... All of our experiments ran on a Linux machine with 16 Intel Xeon CPU cores and 72 GB of RAM total and used Gurobi 10.0.0 with default parameters. Each method ran on a single CPU core. |
| Researcher Affiliation | Academia | Daniël Vos, Sicco Verwer, Delft University of Technology, {d.a.vos, s.e.verwer}@tudelft.nl |
| Pseudocode | No | The paper describes the OMDT formulation using mathematical equations (1-8) and natural language, but it does not contain a structured pseudocode or algorithm block. |
| Open Source Code | Yes | The full code for OMDT and our experiments can be found on GitHub. (Footnote 3: https://github.com/tudelft-cda-lab/OMDT) |
| Open Datasets | Yes | For comparison we implemented 13 environments based on well-known MDPs from the literature, the sizes of these MDPs are given in Table 2. |
| Dataset Splits | No | The paper does not provide explicit training/test/validation dataset splits. Reinforcement learning often involves policy learning within an environment rather than static data splits. |
| Hardware Specification | Yes | All of our experiments ran on a Linux machine with 16 Intel Xeon CPU cores and 72 GB of RAM total and used Gurobi 10.0.0 with default parameters. Each method ran on a single CPU core. |
| Software Dependencies | Yes | All of our experiments ran on a Linux machine with 16 Intel Xeon CPU cores and 72 GB of RAM total and used Gurobi 10.0.0 with default parameters. |
| Experiment Setup | Yes | We consider an OMDT optimal when the relative gap between its objective and bound is proven to be less than 0.01%. We solved OMDTs for a depth of 3 for a maximum of 2 hours and display the results in Table 2. ... All runs were limited to 2 hours. |
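
The stopping rule quoted in the Experiment Setup row (a relative gap below 0.01% between objective and bound, within a 2-hour limit) can be sketched as a small check. This is a minimal illustration, not the authors' code: the names `relative_gap`, `is_proven_optimal`, `GAP_TOLERANCE`, and `TIME_LIMIT_SECONDS` are hypothetical, and the gap is assumed to follow the usual MIP convention of |bound − objective| / |objective|.

```python
# Hypothetical constants mirroring the values quoted in the table.
GAP_TOLERANCE = 1e-4              # 0.01% expressed as a fraction
TIME_LIMIT_SECONDS = 2 * 60 * 60  # 2 hours per run


def relative_gap(objective: float, bound: float) -> float:
    """Relative gap between the incumbent objective and the best bound.

    Assumed convention: |bound - objective| / |objective|.
    """
    if objective == 0.0:
        # Avoid division by zero: the gap is zero only if the bound matches.
        return 0.0 if bound == 0.0 else float("inf")
    return abs(bound - objective) / abs(objective)


def is_proven_optimal(objective: float, bound: float,
                      tolerance: float = GAP_TOLERANCE) -> bool:
    """Return True when the relative gap is strictly below the tolerance."""
    return relative_gap(objective, bound) < tolerance
```

In Gurobi, the corresponding solver settings are the `MIPGap` and `TimeLimit` parameters; how the authors configured the time limit is not stated in the quoted text.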