Learning in POMDPs with Monte Carlo Tree Search

Authors: Sammie Katt, Frans A. Oliehoek, Christopher Amato

ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental 5. Empirical Evaluation. We conducted an empirical evaluation aimed at 3 goals: the first goal is to support the claims made in Section 4 and show that the adaptations to BA-POMCP do not decrease the quality of the resulting policies. Second, we investigate the runtime of those modifications to demonstrate their contribution to the efficiency of BA-POMCP. The last part contains experiments that directly compare the performance per action-selection time with the baseline approach of Ross et al. (2011).
Researcher Affiliation Academia Sammie Katt (1), Frans A. Oliehoek (2), Christopher Amato (1); (1) Northeastern University, Boston, Massachusetts, USA; (2) University of Liverpool, UK.
Pseudocode Yes Algorithm 1 BA-POMCP(b, num_sims), Algorithm 2 SIMULATE(s, d, h), Algorithm 3 BA-POMCP-STEP(s̄ = ⟨s, χ⟩, a), Algorithm 4 R-BA-POMCP-STEP(s̄ = ⟨s, χ⟩, a), Algorithm 5 E-BA-POMCP-STEP(s̄ = ⟨s, χ⟩, a), Algorithm 6 L-BA-POMCP-STEP(s̄_l = ⟨s, l, δ⟩, a). (A hedged sketch of such a simulation step appears below the table.)
Open Source Code No No explicit statement providing concrete access to the source code for the methodology described in this paper was found.
Open Datasets No The paper describes using the 'classical Tiger problem' and a 'partially observable extension to the Sysadmin problem', which appear to be simulated environments or problem setups rather than publicly available datasets with concrete access information.
Dataset Splits No No specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit references to predefined splits) were provided.
Hardware Specification No No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running the experiments were provided.
Software Dependencies No No specific software dependencies, libraries, or solvers with version numbers were mentioned in the paper.
Experiment Setup Yes Table 1: Default experiment parameters, listing: γ = 0.95; horizon (h) = 20; # particles in belief = 1000; exploration const = h · (max(R) − min(R)); # episodes = 100; λ = # updated counts = 30. Also: 'for each count c, we take the true probability of that transition (called p) and (randomly) either subtract or add .15. Note that we do not allow transitions with nonzero probability to fall below 0 by setting those counts to 0.001. The counts of each Dirichlet distribution are then normalized to sum to 20.' (A count-construction sketch appears below the table.)
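
For the Pseudocode row, the following is a minimal Python sketch of what one BA-POMCP-style simulation step could look like: the augmented state pairs a domain state s with Dirichlet counts χ over the transition and observation dynamics, and a step samples s' and o from the normalized counts before incrementing them. The function name, the chi_T/chi_O count layout, and reward_fn are illustrative assumptions, not the paper's pseudocode.

    import numpy as np

    def ba_pomcp_step(s, chi_T, chi_O, a, reward_fn, rng=None):
        """One simulated step in an augmented state ⟨s, χ⟩ (sketch, assumed layout).

        chi_T[s][a] and chi_O[s'][a] are 1-D arrays of Dirichlet counts.
        """
        rng = np.random.default_rng() if rng is None else rng
        # Sample the next state from the expected transition model implied by the counts.
        t_counts = chi_T[s][a]
        s_next = rng.choice(len(t_counts), p=t_counts / t_counts.sum())
        # Sample the observation from the expected observation model implied by the counts.
        o_counts = chi_O[s_next][a]
        o = rng.choice(len(o_counts), p=o_counts / o_counts.sum())
        # Update the counts of the sampled transition and observation in place.
        chi_T[s][a][s_next] += 1
        chi_O[s_next][a][o] += 1
        return s_next, o, reward_fn(s, a, s_next)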
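
The prior-count construction quoted in the Experiment Setup row can be made concrete with a short sketch. The Python snippet below is one illustrative reading of that description, not the authors' code; the function name, the per-(s, a) array layout, and the handling of zero-probability transitions are assumptions.

    import numpy as np

    def perturbed_dirichlet_counts(true_probs, noise=0.15, floor=0.001,
                                   total=20.0, rng=None):
        """Build prior counts for one (s, a) pair from its true transition probabilities."""
        rng = np.random.default_rng() if rng is None else rng
        p = np.asarray(true_probs, dtype=float)
        # Randomly add or subtract the noise term (.15) from each true probability.
        signs = rng.choice([-1.0, 1.0], size=p.shape)
        counts = p + signs * noise
        # Transitions with nonzero true probability may not fall below 0: set them to 0.001.
        counts = np.where((p > 0) & (counts <= 0), floor, counts)
        counts = np.clip(counts, 0.0, None)
        # Normalize the counts of this Dirichlet distribution to sum to 20.
        return total * counts / counts.sum()

    # Example: a Tiger-like transition row with true probabilities [0.85, 0.15].
    print(perturbed_dirichlet_counts([0.85, 0.15]))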