Reference-Based POMDPs
Authors: Edward Kim, Yohan Karunanayake, Hanna Kurniawati
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on long-horizon 2D and 3D navigation problems are presented in Section 6 and indicate that our solver can be employed to substantially outperform POMCP [21]. |
| Researcher Affiliation | Academia | Edward Kim, School of Computing, Australian National University, Canberra, Australia (edward.kim@anu.edu.au); Yohan Karunanayake, School of Computing, Australian National University, Canberra, Australia (yohan.karunanayake@anu.edu.au); Hanna Kurniawati, School of Computing, Australian National University, Canberra, Australia (hanna.kurniawati@anu.edu.au) |
| Pseudocode | Yes | Pseudo code for an implemented algorithm where the reference policy is constructed using a fully-observed policy πFO is presented in the Appendix Section A.6 Algorithm 1. |
| Open Source Code | Yes | We include the source code for REFSOLVER, which is developed on top of pomdp_py as a Supplementary Material. https://github.com/RDLLab/ref_pomdp_neurips23 |
| Open Datasets | No | The paper uses simulated environments and generated maps for its experiments (e.g., 'The environment is a 60 × 60 static gridworld populated...', 'we generated 64 different initial maps with randomly placed obstacles and landmarks'). It does not provide access information or citations for any publicly available or open datasets. |
| Dataset Splits | No | The paper conducts experiments in simulated environments and does not specify training, validation, or test dataset splits. Performance is evaluated directly by running the proposed solver and baselines in these environments. |
| Hardware Specification | Yes | All experiments were performed on a desktop computer with an 8 Core Intel Xeon Silver 4110 Processor and 128GB DDR4 RAM. |
| Software Dependencies | No | The paper mentions using the 'pomdp_py' library, but it does not specify a version number for this or any other software dependency required for replication. |
| Experiment Setup | Yes | POMCP was run with an exploration constant of 300 and a maximum depth of 180 (this maximum depth is the upper bound for tree depth + rollout steps). For REFSOLVER, the maximum tree depth was 90, the maximum rollout depth was 180, and α = 0.5. |
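
The reported experiment setup can be recorded as a small configuration sketch. The class and field names below are hypothetical and are not taken from pomdp_py or the REFSOLVER source; only the numeric values come from the table above. The `rollout_budget` helper is an assumed reading of the statement that POMCP's maximum depth bounds tree depth plus rollout steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class POMCPConfig:
    """Hypothetical container for the POMCP settings reported in the paper."""
    exploration_const: float = 300.0
    max_depth: int = 180  # upper bound on (tree depth + rollout steps)

    def rollout_budget(self, tree_depth: int) -> int:
        """Rollout steps still allowed from a node at the given tree depth."""
        return max(0, self.max_depth - tree_depth)


@dataclass(frozen=True)
class RefSolverConfig:
    """Hypothetical container for the REFSOLVER settings reported in the paper."""
    max_tree_depth: int = 90
    max_rollout_depth: int = 180
    alpha: float = 0.5


pomcp = POMCPConfig()
refsolver = RefSolverConfig()

# Under the assumed reading, a POMCP node at tree depth 90 has 90 rollout steps left.
print(pomcp.rollout_budget(tree_depth=90))
```

Keeping the two solvers' settings in separate frozen dataclasses makes the difference in depth semantics explicit: POMCP shares one budget between tree and rollout, while REFSOLVER reports the two limits independently.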