Monte-Carlo Tree Search for Constrained POMDPs
Authors: Jongmin Lee, Geon-Hyeong Kim, Pascal Poupart, Kee-Eung Kim
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the experiments, we demonstrate that CC-POMCP converges to the optimal stochastic action selection in CPOMDP and pushes the state-of-the-art by being able to scale to very large problems. |
| Researcher Affiliation | Collaboration | Jongmin Lee¹, Geon-Hyeong Kim¹, Pascal Poupart², Kee-Eung Kim¹,³. ¹ School of Computing, KAIST, Republic of Korea; ² University of Waterloo, Waterloo AI Institute and Vector Institute; ³ PROWLER.io |
| Pseudocode | Yes | Algorithm 1 Cost-Constrained POMCP (CC-POMCP) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing its source code or a link to a code repository. |
| Open Datasets | Yes | We first tested CC-POMCP on the synthetic toy domain introduced in [11] to demonstrate convergence to stochastic optimal actions, where the cost constraint ĉ is 0.95. We also conducted experiments on a multi-objective version of PONG, an arcade game running on the Arcade Learning Environment (ALE) [3], depicted in Figure 2a. |
| Dataset Splits | No | The paper mentions using different domains (Toy, RockSample, PONG) for experiments but does not provide specific training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with versions like Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | All the parameters for running CC-POMCP are provided in Appendix H. |