Verifiable RNN-Based Policies for POMDPs Under Temporal Logic Constraints
Authors: Steven Carr, Nils Jansen, Ufuk Topcu
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The numerical experiments show that the proposed approach is up to three orders of magnitude faster than traditional POMDP synthesis methods while achieving values within 2% of the optimal benchmarks. |
| Researcher Affiliation | Academia | Steven Carr¹, Nils Jansen², and Ufuk Topcu¹ — ¹The University of Texas at Austin; ²Radboud University, Nijmegen, The Netherlands |
| Pseudocode | No | The paper includes a procedural flow chart (Figure 2) but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states 'We provide a Python toolchain that employs the probabilistic model checker Storm [Dehnert et al., 2017].', but it does not include a specific link to the code for the methodology described in the paper, nor does it explicitly state that the code is being released publicly (e.g., via a repository or supplementary materials). |
| Open Datasets | Yes | We analyze the method on three different settings: Maze(c), Grid(c) and Navigation(c), for detailed descriptions of these examples see [Norman et al., 2017] and [Carr et al., 2019], respectively. |
| Dataset Splits | No | The paper mentions training and retraining RNNs but does not specify explicit training, validation, or test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) used in their experiments. |
| Hardware Specification | Yes | All experiments are run on a 2.3 GHz machine with a 12 GB memory limit and a max computation time of 10^5 seconds. |
| Software Dependencies | No | The paper mentions using 'Python', the 'probabilistic model checker Storm', and the 'deep learning library Keras' but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | For proper comparison with the synthesis tools we adopt the experiment setup of [Carr et al., 2019], whereby we always iterate for 10 instances of retraining the RNN from counterexample data. ... in this trivial example is given by H(Crit_M^{A1}) = -p log2(p) - (1-p) log2(1-p) = 0 from (2). The average entropy is below a prescribed threshold, η = 0.5, and thus we increase the number of memory nodes, which results in the satisfying FSC A2 in Fig. 3c. |
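The entropy criterion quoted in the Experiment Setup row can be sketched in a few lines. This is a minimal illustration, not the paper's toolchain: the function names (`policy_entropy`, `needs_more_memory`) and the shape of the input (a list of per-state action probabilities for a binary action choice) are assumptions made here for clarity; only the entropy formula and the threshold η = 0.5 come from the quoted text.

```python
import math

def policy_entropy(p):
    """Binary Shannon entropy H(p) = -p*log2(p) - (1-p)*log2(1-p).

    A deterministic choice (p = 0 or p = 1) has entropy 0, matching the
    trivial example quoted above.
    """
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

ETA = 0.5  # prescribed entropy threshold from the paper

def needs_more_memory(action_probs):
    """Average entropy over critical states; a value below ETA triggers
    increasing the number of FSC memory nodes (hypothetical helper)."""
    avg = sum(policy_entropy(p) for p in action_probs) / len(action_probs)
    return avg < ETA

# A deterministic policy has average entropy 0 < 0.5, so the FSC
# would be given more memory nodes.
print(needs_more_memory([0.0, 1.0, 1.0]))  # True
```

The `in (0.0, 1.0)` guard avoids `math.log2(0)` raising a `ValueError` at the deterministic endpoints, where the entropy limit is 0.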