reproducibilityindex.ai

Sparse Tree Search Optimality Guarantees in POMDPs with Continuous Observation Spaces

Authors: Michael H. Lim, Claire Tomlin, Zachary N. Sunberg

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The simple numerical experiments in this section conﬁrm the theoretical results of Section 6. Speciﬁcally, they show that the value function estimates of POSS converge to the QMDP approximation and the value function estimates of POWSS converge to the optimal value function for a toy problem. ... The results plotted in Fig. 2 show the Q-value estimates of POWSS converging toward the optimal Q-values as the width C is increased.
Researcher Affiliation	Academia	Michael H. Lim1 , Claire J. Tomlin1 and Zachary N. Sunberg2 1University of California, Berkeley 2University of Colorado, Boulder
Pseudocode	Yes	Algorithm 1 Routines common to POWSS and POSS... Algorithm 2 POSS Algorithm: Estimate Q... Algorithm 3 POWSS Algorithm: Estimate Q...
Open Source Code	Yes	Both POWSS and POSS were implemented using the POMDPs.jl framework, [Egorov et al., 2017] and open-source code can be found at https://github.com/Julia POMDP/ Sparse Sampling.jl.
Open Datasets	No	We consider a simple modiﬁcation of the classic tiger problem [Kaelbling et al., 1998] that we refer to as the continuous observation tiger (CO-tiger) problem.
Dataset Splits	No	The discount is 0.95, and the terminal depth is 3.
Hardware Specification	No	At C = 41, POMCPOW is about an order of magnitude faster than POWSS.
Software Dependencies	No	Both POWSS and POSS were implemented using the POMDPs.jl framework, [Egorov et al., 2017] and open-source code can be found at https://github.com/Julia POMDP/ Sparse Sampling.jl.
Experiment Setup	Yes	The discount is 0.95, and the terminal depth is 3. ... Each data point represents the mean Q-value from 200 runs of the algorithm from a uniformly-distributed belief, with the standard deviation plotted as a ribbon. ... For these tests, the double progressive widening parameters ko = C, αo = 0 were used to limit the tree width, with n = C3 iterations to keep the particle density high in wider trees (see Sunberg and Kochenderfer [2018] for parameter deﬁnitions).