Online POMDP Planning with Anytime Deterministic Guarantees

Authors: Moran Barenboim, Vadim Indelman

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present the experimental results obtained by integrating deterministic bounds into state-of-the-art algorithms, namely AR-DESPOT Somani et al. [2013] and POMCP Silver and Veness [2010], as a baseline. The primary objective of these experiments is to demonstrate the validity of our derived bounds, as presented in Theorem 2, and the corresponding algorithm outlined in Algorithm 1.
Researcher Affiliation | Collaboration | Moran Barenboim, Technion Autonomous Systems Program (TASP), Technion - Israel Institute of Technology, NVIDIA, moranbar@campus.technion.ac.il; Vadim Indelman, Department of Aerospace Engineering, Technion - Israel Institute of Technology, vadim.indelman@technion.ac.il
Pseudocode | Yes | Algorithm 1 ALGORITHM-A:
Open Source Code | Yes | Finally, we provide our algorithm implementation in https://github.com/moranbar/Online-POMDP-Planning-with-Anytime-Deterministic-Guarantees.
Open Datasets | Yes | For these experiments, we focused on a toy example, Tiger POMDP Kaelbling et al. [1998]. ... We evaluated the performance of both algorithms on different POMDPs, including the Tiger POMDP, Discrete Light Dark Sunberg and Kochenderfer [2018] and Baby POMDP. The corresponding results are summarized in Table 1. ... In the Laser Tag problem, Somani et al. [2013], an agent has to navigate through a grid world...
Dataset Splits | No | The paper describes experiments conducted through '100 simulations' or '100 runs' of the POMDP scenarios, but it does not specify explicit training, validation, or test dataset splits in the conventional sense for model training or evaluation on a pre-existing dataset.
Hardware Specification | Yes | The experiments were conducted on a computing platform consisting of an Intel(R) Core(TM) i7-7700 processor with 8 CPUs operating at 3.60 GHz and 15.6 GHz.
Software Dependencies | No | The paper states 'The implementation of our algorithm was carried out using the Julia programming language and evaluated through the Julia POMDPs package...' but does not specify version numbers for Julia or the POMDPs package.
Experiment Setup | Yes | The selection of hyper-parameters for the POMCP and AR-DESPOT solvers, and further details about the POMDPs used for our experiments are detailed in the appendix. ... the hyperparameter K was varied across {10, 50, 500, 5000}, while λ was evaluated at {0, 0.01, 0.1}. Similarly, DB-POMCP and POMCP were examined with three different values for the exploration-exploitation weight, c = {0.1, 1.0, 10.0} multiplied by Vmax, which denotes an upper bound for the value function. For the initialization of the upper and lower bounds used by the algorithms, we used the maximal reward, multiplied by the remaining time steps of the episode, Rmax · (T - t - 1).
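
The sweep quoted in the Experiment Setup row can be written out as a small sketch. This is not the authors' implementation, which is in Julia; it is a Python illustration built only from the quoted values. The function names, the example values for Rmax, Vmax and the horizon T, and the reading of the bound initialization as Rmax · (T - t - 1) are assumptions made here. The quoted text elides which solvers the K and λ sweep applies to, and it does not state the sign convention for the lower bound, so the sketch returns only the magnitude.

```python
from itertools import product

# Values quoted in the Experiment Setup row.
K_VALUES = [10, 50, 500, 5000]
LAMBDA_VALUES = [0, 0.01, 0.1]
C_FACTORS = [0.1, 1.0, 10.0]


def k_lambda_grid():
    """Return every (K, lambda) combination from the quoted sweep."""
    return [{"K": k, "lambda": lam} for k, lam in product(K_VALUES, LAMBDA_VALUES)]


def exploration_weights(v_max):
    """Return the exploration-exploitation weights c * Vmax for the quoted factors."""
    return [factor * v_max for factor in C_FACTORS]


def initial_bound_magnitude(r_max, horizon, t):
    """Return Rmax * (T - t - 1): maximal reward times the remaining time steps.

    Assumption: the garbled expression in the source is read as Rmax * (T - t - 1).
    """
    remaining_steps = horizon - t - 1
    if remaining_steps < 0:
        raise ValueError("time step t lies beyond the horizon")
    return r_max * remaining_steps


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    R_MAX, V_MAX, HORIZON = 10.0, 10.0, 20
    print(len(k_lambda_grid()), "K/lambda configurations")
    print("exploration weights:", exploration_weights(V_MAX))
    print("bound magnitude at t = 0:", initial_bound_magnitude(R_MAX, HORIZON, 0))
```

With the quoted sets, the K and λ sweep has 4 × 3 = 12 configurations, and the exploration weight takes three values per problem, each scaled by that problem's Vmax.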