Monte-Carlo Tree Search for Constrained POMDPs

Authors: Jongmin Lee, Geon-hyeong Kim, Pascal Poupart, Kee-Eung Kim

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In the experiments, we demonstrate that CC-POMCP converges to the optimal stochastic action selection in CPOMDP and pushes the state-of-the-art by being able to scale to very large problems."
Researcher Affiliation | Collaboration | Jongmin Lee (1), Geon-hyeong Kim (1), Pascal Poupart (2), Kee-Eung Kim (1,3). 1: School of Computing, KAIST, Republic of Korea; 2: University of Waterloo, Waterloo AI Institute and Vector Institute; 3: PROWLER.io
Pseudocode | Yes | "Algorithm 1 Cost-Constrained POMCP (CC-POMCP)"
Open Source Code | No | The paper does not provide any explicit statement about releasing its source code or a link to a code repository.
Open Datasets | Yes | "We first tested CC-POMCP on the synthetic toy domain introduced in [11] to demonstrate convergence to stochastic optimal actions, where the cost constraint ĉ is 0.95. We also conducted experiments on a multi-objective version of PONG, an arcade game running on the Arcade Learning Environment (ALE) [3], depicted in Figure 2a."
Dataset Splits | No | The paper mentions using different domains (Toy, Rocksample, PONG) for experiments but does not provide specific training, validation, or test dataset splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with versions like Python 3.8, PyTorch 1.9).
Experiment Setup | Yes | "All the parameters for running CC-POMCP are provided in Appendix H."
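The pseudocode row above refers to Algorithm 1 (CC-POMCP), which augments POMCP with a Lagrange multiplier so that tree search trades reward against cost under the budget ĉ. The sketch below is illustrative only and is not the authors' code: the function names, the exact UCB exploration term, and the multiplier update rule are assumptions based on the general structure of Lagrangian-scalarized UCB search.

```python
import math

# Illustrative sketch (assumed structure, not the paper's implementation):
# each node maps action -> visit count n, reward estimate q_r, cost estimate q_c.

def ucb_action(node, lam, c_ucb=1.0):
    """Pick the action maximizing the lambda-scalarized UCB value
    Q_r(a) - lam * Q_c(a) plus an exploration bonus."""
    total_n = sum(s["n"] for s in node.values())
    best_a, best_v = None, -math.inf
    for a, stats in node.items():
        bonus = c_ucb * math.sqrt(math.log(total_n + 1) / (stats["n"] + 1e-9))
        v = stats["q_r"] - lam * stats["q_c"] + bonus
        if v > best_v:
            best_a, best_v = a, v
    return best_a

def update_lambda(lam, q_c_root, c_hat, alpha, lam_max):
    """Gradient step on the Lagrange multiplier: raise lam when the
    estimated root cost exceeds the budget c_hat, lower it otherwise,
    clipped to [0, lam_max]."""
    lam = lam + alpha * (q_c_root - c_hat)
    return min(max(lam, 0.0), lam_max)
```

For example, with two actions of equal visit counts, a larger multiplier shifts the selection toward the cheaper action even when it has lower reward, which is how the scalarization enforces the cost constraint during search.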