Bayesian Optimized Monte Carlo Planning

Authors: John Mern, Anil Yildiz, Zachary Sunberg, Tapan Mukerji, Mykel J. Kochenderfer
Pages: 11880-11887

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the effectiveness of BOMCP, we conducted a series of experiments on three distinct POMDPs. We evaluated the performance of BOMCP against the performance of POMCPOW and expert policies for each problem. For each experiment, we recorded the task score as well as the wall clock run time per-search to measure the computation cost.
Researcher Affiliation | Academia | John Mern¹, Anil Yildiz¹, Zachary Sunberg², Tapan Mukerji³, and Mykel J. Kochenderfer¹. ¹Stanford University, Department of Aeronautics and Astronautics, 496 Lomita Mall, Stanford, CA 94305; ²University of Colorado Boulder, Department of Aerospace Engineering Sciences, 3775 Discovery Drive, Boulder, CO 80303; ³Stanford University, Department of Energy Resources Engineering, 367 Panama Street, Stanford, CA 94305
Pseudocode | Yes | Algorithm 1 Plan, Algorithm 2 Simulate, Algorithm 3 Bayesian Optimization
Open Source Code | Yes | Source code for BOMCP is available at https://github.com/sisl/BOMCP.jl.
Open Datasets | Yes | We used data from the Global Wind Atlas at the Altamont Pass wind farm, which covers an area of approximately 392 km².
Dataset Splits | No | The paper does not provide specific details on training, validation, and test splits for the datasets or simulation environments used.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper states, 'We implemented BOMCP and BOMCTS in Julia building upon the POMDPs.jl package (Egorov et al. 2017),' but does not provide specific version numbers for Julia or POMDPs.jl.
Experiment Setup | Yes | For all tests, the same values were used for hyper-parameters shared by BOMCP and POMCPOW such as K_action and α_action. The initial vehicle state is sampled from a multivariate Gaussian with mean µ = (x = 0, y = 50, θ = 0, ẋ = 0, ẏ = 10, ω = 0). The action space is a three-dimensional continuous space defined by the tuple (T, Fx, δ). T is the main thrust, which is in the range [0, 15]. Fx is the corrective thrust, which is in the range [−5, 5], and δ ∈ [−1, 1].
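The experiment setup quoted above can be illustrated with a minimal Python sketch. It is not taken from the BOMCP.jl source, which is written in Julia. The sketch rests on several assumptions: the repeated x and y in the mean are read as velocities, the lower bounds of the Fx and δ ranges are read as negative, and the standard deviations are hypothetical placeholders, because the summary gives only the mean of the Gaussian and not its covariance. All function and constant names are invented for illustration.

```python
import random

# State layout assumed from the setup text: position, heading, then velocities.
STATE_NAMES = ("x", "y", "theta", "x_dot", "y_dot", "omega")

# Mean of the initial-state Gaussian as listed in the summary.
INITIAL_MEAN = (0.0, 50.0, 0.0, 0.0, 10.0, 0.0)

# Hypothetical standard deviations (assumption): the covariance is not reported.
ASSUMED_STD = (1.0, 1.0, 0.1, 1.0, 1.0, 0.1)

# Action bounds for (T, Fx, delta); negative lower bounds are an assumed reading.
ACTION_BOUNDS = {"T": (0.0, 15.0), "Fx": (-5.0, 5.0), "delta": (-1.0, 1.0)}


def sample_initial_state(rng):
    """Draw one initial state, treating the components as independent Gaussians."""
    return {
        name: rng.gauss(mu, sd)
        for name, mu, sd in zip(STATE_NAMES, INITIAL_MEAN, ASSUMED_STD)
    }


def sample_action(rng):
    """Draw an action uniformly at random from inside the bounds."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in ACTION_BOUNDS.items()}


def clip_action(action):
    """Clamp each action component to its allowed range."""
    return {
        k: min(max(action[k], lo), hi) for k, (lo, hi) in ACTION_BOUNDS.items()
    }


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    print(sample_initial_state(rng))
    print(sample_action(rng))
    print(clip_action({"T": 20.0, "Fx": -7.5, "delta": 0.3}))
```

With these bounds, `clip_action({"T": 20.0, "Fx": -7.5, "delta": 0.3})` returns `{"T": 15.0, "Fx": -5.0, "delta": 0.3}`. A diagonal covariance keeps the sketch dependency-free. A full multivariate Gaussian would need the covariance matrix from the paper.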