Optimal Decision Tree Policies for Markov Decision Processes

Authors: Daniël Vos, Sicco Verwer

IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experiments comparing the performance of OMDTs with VIPER and dtcontrol. ... All of our experiments ran on a Linux machine with 16 Intel Xeon CPU cores and 72 GB of RAM total and used Gurobi 10.0.0 with default parameters. Each method ran on a single CPU core. |
| Researcher Affiliation | Academia | Daniël Vos, Sicco Verwer, Delft University of Technology, {d.a.vos, s.e.verwer}@tudelft.nl |
| Pseudocode | No | The paper describes the OMDT formulation using mathematical equations (1-8) and natural language, but it does not contain a structured pseudocode or algorithm block. |
| Open Source Code | Yes | The full code for OMDT and our experiments can be found on GitHub. (Footnote 3: https://github.com/tudelft-cda-lab/OMDT) |
| Open Datasets | Yes | For comparison we implemented 13 environments based on well-known MDPs from the literature, the sizes of these MDPs are given in Table 2. |
| Dataset Splits | No | The paper does not provide explicit training/test/validation dataset splits. Reinforcement learning often involves policy learning within an environment rather than static data splits. |
| Hardware Specification | Yes | All of our experiments ran on a Linux machine with 16 Intel Xeon CPU cores and 72 GB of RAM total and used Gurobi 10.0.0 with default parameters. Each method ran on a single CPU core. |
| Software Dependencies | Yes | All of our experiments ran on a Linux machine with 16 Intel Xeon CPU cores and 72 GB of RAM total and used Gurobi 10.0.0 with default parameters. |
| Experiment Setup | Yes | We consider an OMDT optimal when the relative gap between its objective and bound is proven to be less than 0.01%. We solved OMDTs for a depth of 3 for a maximum of 2 hours and display the results in Table 2. ... All runs were limited to 2 hours. |
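The solver settings reported above (Gurobi 10.0.0 with default parameters, one CPU core per method, a 0.01% relative optimality gap, and a 2-hour time limit) correspond to standard Gurobi parameters. The sketch below shows how such a configuration could be expressed with the gurobipy Python API; the model contents are placeholders and this is not the paper's OMDT formulation, only an illustration of the reported solver settings.

```python
import gurobipy as gp
from gurobipy import GRB

# Illustrative sketch only: it mirrors the solver settings reported above,
# not the OMDT MILP formulation (equations 1-8 in the paper).
model = gp.Model("omdt_solver_settings_sketch")

# "Each method ran on a single CPU core."
model.Params.Threads = 1

# "We consider an OMDT optimal when the relative gap between its objective
# and bound is proven to be less than 0.01%."
model.Params.MIPGap = 1e-4

# "All runs were limited to 2 hours." (Gurobi expects seconds.)
model.Params.TimeLimit = 2 * 60 * 60

# Placeholder variable and objective so the model is solvable as written;
# the real model would encode a depth-3 decision tree policy over the MDP.
x = model.addVar(vtype=GRB.BINARY, name="placeholder")
model.setObjective(x, GRB.MAXIMIZE)

model.optimize()

# After solving, model.MIPGap gives the proven relative gap and model.Status
# indicates whether the run hit the time limit (GRB.TIME_LIMIT).
```

With these parameters, a run that terminates with status GRB.OPTIMAL has a proven gap below 0.01%, matching the paper's optimality criterion; runs stopped at GRB.TIME_LIMIT report the best tree found within 2 hours.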