Markovian State and Action Abstractions for MDPs via Hierarchical MCTS
Authors: Aijun Bai, Siddharth Srivastava, Stuart Russell
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments: Figures 3a, 3e, 3b and 3f, with x-axis in log scale, show the results of running UCT, UCTφ, POMCP(M, φ), POMCP(M, φ, O) and smart-POMCP(M, φ, O) in ROOMS[17, 17, 4] and ROOMS[25, 13, 8] problems |
| Researcher Affiliation | Collaboration | Aijun Bai UC Berkeley aijunbai@berkeley.edu Siddharth Srivastava United Tech. Research Center srivass@utrc.utc.com Stuart Russell UC Berkeley russell@cs.berkeley.edu |
| Pseudocode | Yes | Figure 2: POMCP(M, φ, O) Markovian state and action abstractions for MDPs via hierarchical MCTS. (This figure contains the pseudocode blocks). |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper uses custom problem domains ('ROOMS[m, n, k]' and 'C-ROOMS[m, n, k]') which are described, but no concrete access information (link, DOI, repository, or citation to an established public dataset) is provided. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers. |
| Experiment Setup | Yes | The discount factor is $\gamma = 0.98$. The maximal planning horizon is determined as $H = \lfloor \log_{\gamma} \epsilon \rfloor = 341$, where $\epsilon$ is set to be 0.001. |
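
The planning horizon quoted in the Experiment Setup row can be checked with a few lines of arithmetic. The sketch below assumes the horizon is the floor of the base-γ logarithm of a threshold ε, which matches the reported values γ = 0.98, ε = 0.001 and H = 341. The function name `planning_horizon` is a hypothetical helper for illustration and is not taken from the paper.

```python
import math


def planning_horizon(gamma: float, epsilon: float) -> int:
    """Largest integer H with gamma**H >= epsilon (assumed reading of the setup row).

    Uses the change-of-base identity log_gamma(epsilon) = ln(epsilon) / ln(gamma).
    """
    if not (0.0 < gamma < 1.0):
        raise ValueError("gamma must lie strictly between 0 and 1")
    if not (0.0 < epsilon < 1.0):
        raise ValueError("epsilon must lie strictly between 0 and 1")
    return math.floor(math.log(epsilon) / math.log(gamma))


# Values reported in the table: gamma = 0.98, epsilon = 0.001
H = planning_horizon(0.98, 0.001)
print(H)
```

Under this reading, H is the last step at which the discount weight γ^H is still at least ε, so rewards beyond the horizon contribute less than ε times their magnitude per step.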