Saturated Path-Constrained MDP: Planning under Uncertainty and Deterministic Model-Checking Constraints
Authors: Jonathan Sprauel, Andrey Kolobov, Florent Teichteil-Königsbuch
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose a dynamic programming-based algorithm for finding such policies, and empirically demonstrate this algorithm to be orders of magnitude faster than its next-best alternative. ... The objectives of our experiments are two-fold: (1) to compare SPC VI with the algorithm for PC MDPs, PC MDP-ILP (Teichteil-Königsbuch 2012), in terms of efficiency, and (2) to validate SPC MDPs as an efficient modeling tool. The experiments were run with 5.8 GB of RAM on a 2.80GHz CPU. |
| Researcher Affiliation | Collaboration | Jonathan Sprauel and Florent Teichteil-Königsbuch, {jonathan.sprauel, florent.teichteil}@onera.fr, ONERA, The French Aerospace Lab, 2 Avenue Edouard-Belin, F-31055 Toulouse, France; Andrey Kolobov, akolobov@microsoft.com, Microsoft Research, Redmond, WA-98052, USA |
| Pseudocode | Yes | Algorithm 1: SPC MDP Value Iteration; Algorithm 2: explore(S, Tip) function; Algorithm 3: updateReachability(S, ξi, θ) function |
| Open Source Code | No | The paper does not provide a direct link or an explicit statement about the availability of its source code. |
| Open Datasets | No | We randomly generated instances with different grid dimensions (between 10x10 and 100x100, with a total number of states between 200 and 160 000), time parameter values, and numbers of zones of each type (between 1 and 5 per type). ... We tested instances having from 1 computer (1246 states) to 3 computers (1 014 013 states). The paper describes generating its own instances rather than using or providing concrete access to a publicly available dataset. |
| Dataset Splits | No | The paper does not explicitly describe training, validation, and test dataset splits. |
| Hardware Specification | Yes | The experiments were run with 5.8 GB of RAM on a 2.80GHz CPU. |
| Software Dependencies | No | The paper mentions using the "PPDDL language" but does not specify any software names with version numbers or other key software dependencies. |
| Experiment Setup | Yes | In all experiments, the SPC MDP discount factor was γ = 0.9. ... For ϵ-optimality we set ϵ = 0.1, with a corresponding ω = 0.001, since we fixed the penalty of a fire occurrence to -1. ... Since the reward function ranges from -50 to 50, we chose a parameter ϵ = 10 for the ϵ-optimality; the corresponding ω parameter is 0.001. |