Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Markovian State and Action Abstractions for MDPs via Hierarchical MCTS

Authors: Aijun Bai, Siddharth Srivastava, Stuart Russell

IJCAI 2016 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments, Figures 3a, 3e, 3b and 3f with x axis in log scale show the results of running UCTφ, POMCP(M, φ), POMCP(M, φ, O) and smart-POMCP(M, φ, O) in ROOMS[17, 17, 4] and ROOMS[25, 13, 8] problems. |
| Researcher Affiliation | Collaboration | Aijun Bai (UC Berkeley, EMAIL); Siddharth Srivastava (United Tech. Research Center, EMAIL); Stuart Russell (UC Berkeley, EMAIL) |
| Pseudocode | Yes | Figure 2: POMCP(M, φ, O) Markovian state and action abstractions for MDPs via hierarchical MCTS. (This figure contains the pseudocode blocks.) |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper uses custom problem domains ('ROOMS[m, n, k]' and 'C-ROOMS[m, n, k]'), which are described, but no concrete access information (link, DOI, repository, or citation to an established public dataset) is provided. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers. |
| Experiment Setup | Yes | The discount factor is γ = 0.98. The maximal planning horizon is determined as H = ⌊log_γ ε⌋ = 341, where ε is set to be 0.001. |
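
The horizon value quoted in the Experiment Setup row can be checked with a few lines of arithmetic. The sketch below assumes the expression is the floor of the base-γ logarithm of the 0.001 threshold; the function name `planning_horizon` and the parameter name `epsilon` are illustrative choices, not names taken from the paper.

```python
import math


def planning_horizon(gamma: float, epsilon: float) -> int:
    """Return floor(log_gamma(epsilon)) for a discount factor and threshold.

    Both names are hypothetical; the formula is an assumed reading of the
    quoted setup, in which discounted rewards beyond this depth are
    weighted by less than epsilon.
    """
    if not (0.0 < gamma < 1.0):
        raise ValueError("gamma must lie strictly between 0 and 1")
    if not (0.0 < epsilon < 1.0):
        raise ValueError("epsilon must lie strictly between 0 and 1")
    # Change of base: log_gamma(epsilon) = ln(epsilon) / ln(gamma)
    return math.floor(math.log(epsilon) / math.log(gamma))


# Values quoted in the Experiment Setup row
H = planning_horizon(gamma=0.98, epsilon=0.001)
print(H)  # 341
```

Under this reading, γ^341 is still slightly above 0.001 while γ^342 falls below it, which matches the reported H = 341.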