Online POMDP Planning with Anytime Deterministic Guarantees
Authors: Moran Barenboim, Vadim Indelman
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present the experimental results obtained by integrating deterministic bounds into state-of-the-art algorithms, namely AR-DESPOT Somani et al. [2013] and POMCP Silver and Veness [2010], used as baselines. The primary objective of these experiments is to demonstrate the validity of our derived bounds, as presented in Theorem 2, and the corresponding algorithm outlined in Algorithm 1. |
| Researcher Affiliation | Collaboration | Moran Barenboim: Technion Autonomous Systems Program (TASP), Technion - Israel Institute of Technology; NVIDIA; moranbar@campus.technion.ac.il. Vadim Indelman: Department of Aerospace Engineering, Technion - Israel Institute of Technology; vadim.indelman@technion.ac.il |
| Pseudocode | Yes | Algorithm 1 ALGORITHM-A: |
| Open Source Code | Yes | Finally, we provide our algorithm implementation in https://github.com/moranbar/Online-POMDP-Planning-with-Anytime-Deterministic-Guarantees. |
| Open Datasets | Yes | For these experiments, we focused on a toy example, Tiger POMDP Kaelbling et al. [1998]. ... We evaluated the performance of both algorithms on different POMDPs, including the Tiger POMDP, Discrete Light Dark Sunberg and Kochenderfer [2018] and Baby POMDP. The corresponding results are summarized in Table 1. ... In the Laser Tag problem, Somani et al. [2013], an agent has to navigate through a grid world... |
| Dataset Splits | No | The paper describes experiments conducted through '100 simulations' or '100 runs' of the POMDP scenarios, but it does not specify explicit training, validation, or test dataset splits in the conventional sense for model training or evaluation on a pre-existing dataset. |
| Hardware Specification | Yes | The experiments were conducted on a computing platform consisting of an Intel(R) Core(TM) i7-7700 processor with 8 CPUs operating at 3.60 GHz and 15.6 GB of RAM. |
| Software Dependencies | No | The paper states 'The implementation of our algorithm was carried out using the Julia programming language and evaluated through the Julia POMDPs package...' but does not specify version numbers for Julia or the POMDPs package. |
| Experiment Setup | Yes | The selection of hyper-parameters for the POMCP and AR-DESPOT solvers, and further details about the POMDPs used for our experiments, are given in the appendix. ... the hyperparameter K was varied across {10, 50, 500, 5000}, while λ was evaluated at {0, 0.01, 0.1}. Similarly, DB-POMCP and POMCP were examined with three different values for the exploration-exploitation weight, c = {0.1, 1.0, 10.0} multiplied by $V_{\max}$, which denotes an upper bound for the value function. For the initialization of the upper and lower bounds used by the algorithms, we used the maximal reward multiplied by the remaining time steps of the episode, $R_{\max} \cdot (T - t - 1)$. |
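
The hyperparameter grid and bound initialization quoted in the Experiment Setup row can be sketched in a few lines. This is an illustrative Python sketch only, not the authors' Julia implementation: the function names `despot_grid`, `pomcp_exploration_weights`, and `initial_bound` are hypothetical, and the numeric inputs in the usage note are made-up placeholders rather than values from the paper.

```python
from itertools import product

# Values quoted in the Experiment Setup row.
K_VALUES = (10, 50, 500, 5000)
LAMBDA_VALUES = (0.0, 0.01, 0.1)
C_MULTIPLIERS = (0.1, 1.0, 10.0)


def despot_grid():
    """Return every (K, lambda) combination from the quoted sets."""
    return list(product(K_VALUES, LAMBDA_VALUES))


def pomcp_exploration_weights(v_max):
    """Return the exploration weights c = multiplier * V_max.

    v_max is an upper bound on the value function (problem dependent).
    """
    return [m * v_max for m in C_MULTIPLIERS]


def initial_bound(r_max, horizon, t):
    """Return R_max * (T - t - 1): maximal reward times remaining steps.

    horizon is T and t is the current time step (hypothetical argument names).
    """
    remaining = horizon - t - 1
    if remaining < 0:
        raise ValueError("t must be smaller than the horizon")
    return r_max * remaining
```

For example, with a placeholder horizon of 5 and a placeholder maximal reward of 10, `initial_bound(10, 5, 0)` multiplies the reward by the 4 remaining steps, and `despot_grid()` enumerates the 4 × 3 combinations of K and λ.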