Optimal Cost Almost-Sure Reachability in POMDPs
Authors: Krishnendu Chatterjee, Martin Chmelik, Raghav Gupta, Ayush Kanodia
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5, Experimental Results: We have implemented Algorithm 1: our algorithm first implements the σAllow computation, and for the finite-horizon value iteration (Step 3 of Algorithm 1) we implement two approaches. The first is the exact finite-horizon value iteration using a modified version of POMDP-Solve (Cassandra 2005); the second is an approximate finite-horizon value iteration using a modified version of RTDP-Bel (Bonet and Geffner 2009). In both cases our straightforward modification is that the computation of the finite-horizon value iteration is restricted to allowed actions and almost-sure winning beliefs. We experimented on several well-known examples of POMDPs. ... Table 1: Experimental results (a minimal illustrative sketch of this restriction follows the table) |
| Researcher Affiliation | Academia | Krishnendu Chatterjee, IST Austria, Klosterneuburg, Austria (kchatterjee@ist.ac.at); Martin Chmelík, IST Austria, Klosterneuburg, Austria (mchmelik@ist.ac.at); Raghav Gupta, IIT Bombay, India (raghavgupta93@gmail.com); Ayush Kanodia, IIT Bombay, India (kanodiaayush@gmail.com) |
| Pseudocode | Yes | Algorithm 1 APPROXALGO |
| Open Source Code | No | The paper states 'We have implemented our approximation algorithms developing on the existing implementations for finite-horizon objectives, and present experimental results on a number of well-known examples.' but does not provide a link or explicit statement about making their specific implementation open-source. |
| Open Datasets | Yes | The POMDP examples we considered are as follows: (A) We experimented with the Cheese maze POMDP example which was studied in (McCallum 1992; Dutech 2000; Littman, Cassandra, and Kaelbling 1995; McCracken and Bowling 2005). (B) We considered the Grid POMDP studied in (Russell et al. 1995; Littman, Cassandra, and Kaelbling 1995; Parr and Russell 1995; McCracken and Bowling 2005). (C) We experimented with the robot navigation problem POMDP introduced in (Littman, Cassandra, and Kaelbling 1995)... (D) We consider the Hallway example from (Littman, Cassandra, and Kaelbling 1995; Spaan 2004; Smith and Simmons 2004; Bonet and Geffner 2009). (E) We consider the Rock Sample example from (Bonet and Geffner 2009; Smith and Simmons 2004). |
| Dataset Splits | No | The paper uses well-known POMDP examples but does not describe any specific training, validation, or test dataset splits for its experiments, as the experiments evaluate an algorithm on predefined POMDP models rather than learning from data with such splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | Yes | We implement two approaches. The first is the exact finite-horizon value iteration using a modified version of POMDP-Solve (Cassandra 2005); and the second is an approximate finite-horizon value iteration using a modified version of RTDP-Bel (Bonet and Geffner 2009); and in both cases our straightforward modification is that the computation of the finite-horizon value iteration is restricted to allowed actions and almost-sure winning beliefs. ... Cassandra, A. 2005. Pomdp-solve [software, version 5.3]. http://www.pomdp.org/. |
| Experiment Setup | Yes | For the exact computation, we consider multiplicative approximation with ϵ = 0.1 and report the number of iterations, the time required by the exact computation, and the computed value. For the approximate computation, we report the time required for the number of trials specified for the computation of the finite-horizon value iteration and the computed value. ... Table 1: Experimental results (reports values such as 'ϵ = 0.1' and 'Trials 12k'; a hedged sketch of the ϵ stopping rule follows the table) |
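
The Research Type row above quotes the authors' description of their modification to POMDP-Solve and RTDP-Bel: finite-horizon value iteration is computed only over allowed actions and almost-sure winning beliefs. The sketch below is a minimal Python illustration of that kind of restriction on a toy POMDP; it is not the authors' implementation, and the toy model, the `allowed_actions` filter, and all names are assumptions made for illustration only.

```python
# Toy POMDP; T, Z, COST, TARGET and allowed_actions are illustrative
# assumptions, not the data structures used in the paper's tools.

ACTIONS = ["a", "b"]
OBS = ["o0", "o1"]

# T[s][action] = {next_state: probability}
T = {
    "s0":   {"a": {"s1": 0.8, "s0": 0.2}, "b": {"s0": 1.0}},
    "s1":   {"a": {"goal": 0.9, "s0": 0.1}, "b": {"s1": 1.0}},
    "goal": {"a": {"goal": 1.0}, "b": {"goal": 1.0}},
}
# Z[next_state][action] = {observation: probability}
Z = {
    "s0":   {"a": {"o0": 1.0}, "b": {"o0": 1.0}},
    "s1":   {"a": {"o1": 1.0}, "b": {"o1": 1.0}},
    "goal": {"a": {"o1": 1.0}, "b": {"o1": 1.0}},
}
COST = {"a": 1.0, "b": 1.0}
TARGET = {"goal"}

def allowed_actions(support):
    """Stand-in for the paper's almost-sure-winning restriction: in this toy
    model we simply forbid the self-looping action 'b' outside the target."""
    return ["a", "b"] if support <= TARGET else ["a"]

def belief_update(belief, action, obs):
    """Standard Bayes update of a belief (dict state -> probability)."""
    new = {}
    for s, p in belief.items():
        for s2, pt in T[s][action].items():
            w = p * pt * Z[s2][action].get(obs, 0.0)
            if w > 0:
                new[s2] = new.get(s2, 0.0) + w
    norm = sum(new.values())
    return {s: p / norm for s, p in new.items()} if norm > 0 else None

def finite_horizon_value(belief, horizon):
    """Expected cost to the target over `horizon` steps, with the action set
    at each belief restricted by allowed_actions (the paper's modification)."""
    if horizon == 0 or set(belief) <= TARGET:
        return 0.0
    best = float("inf")
    for a in allowed_actions(frozenset(belief)):
        # Probability of each observation under (belief, a).
        obs_prob = {o: sum(p * pt * Z[s2][a].get(o, 0.0)
                           for s, p in belief.items()
                           for s2, pt in T[s][a].items())
                    for o in OBS}
        val = COST[a]
        for o, po in obs_prob.items():
            if po > 0:
                val += po * finite_horizon_value(belief_update(belief, a, o),
                                                 horizon - 1)
        best = min(best, val)
    return best

if __name__ == "__main__":
    print(finite_horizon_value({"s0": 1.0}, 8))
```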
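
For the "multiplicative approximation with ϵ = 0.1" in the Experiment Setup row, one plausible reading is that the horizon is deepened until two consecutive value estimates agree up to a relative factor of ϵ, with the number of deepening steps reported as the iteration count. The stopping rule below is a hedged sketch under that assumption, not the exact criterion used in the paper.

```python
EPSILON = 0.1  # the ϵ reported in the paper's Table 1

def approximate_value(value_at_horizon, eps=EPSILON, max_horizon=1000):
    """value_at_horizon: callable h -> finite-horizon value from the initial
    belief. Stops once the relative change between consecutive horizons is
    at most eps (an assumed reading of 'multiplicative approximation')."""
    prev = value_at_horizon(1)
    for h in range(2, max_horizon + 1):
        cur = value_at_horizon(h)
        if prev > 0 and abs(cur - prev) <= eps * prev:
            return cur, h  # approximate value and number of iterations
        prev = cur
    return prev, max_horizon

# Example use with the toy sketch above:
# value, iterations = approximate_value(
#     lambda h: finite_horizon_value({"s0": 1.0}, h))
```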