reproducibilityindex.ai

State Aggregation in Monte Carlo Tree Search

Authors: Jesse Hostetler, Alan Fern, Tom Dietterich

AAAI 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	As a proof of concept, we experimentally confirm that state aggregation can improve the finite-sample performance of UCT. This section presents a small experiment that demonstrates the sample complexity benefits of abstraction.
Researcher Affiliation	Academia	Jesse Hostetler and Alan Fern and Tom Dietterich Department of Electrical Engineering and Computer Science Oregon State University {hostetje, afern, tgd}@eecs.oregonstate.edu
Pseudocode	No	The paper describes algorithms and their modifications in paragraph form but does not include any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide any statement or link indicating that its source code is publicly available.
Open Datasets	No	"Our experimental domain is a version of the card game Blackjack. We play to a maximum score of 32, instead of 21 for ordinary Blackjack. This makes the planning horizon longer, which allows abstraction to have a larger effect. We draw from an infinite deck so that card counting is not helpful, and we do not allow doubling down, splitting pairs, or surrendering." The paper describes a custom experimental domain based on Blackjack but does not provide any concrete access information (link, DOI, citation) to a publicly available dataset.
Dataset Splits	No	The paper mentions running experiments for "varying sample limits" and measuring "average return over 10^5 games" but does not provide specific details on training, validation, or test splits of any dataset.
Hardware Specification	No	No specific hardware details are mentioned in the paper.
Software Dependencies	No	No specific software dependencies with version numbers are mentioned in the paper.
Experiment Setup	No	"We ran χ-UCT with the four representations for varying sample limits. The performance measure is the average return over 10^5 games." While the paper describes the experimental task (Blackjack variation) and the number of games played, it does not provide specific hyperparameters, optimizer settings, or detailed training configurations (e.g., learning rates, batch sizes, epochs) as required for a "Yes" answer.
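
Because no code is released, the quoted domain description is the only specification available. The following is a minimal Python sketch of what such a domain and its "average return" measure might look like. It is not the authors' implementation. Only the maximum score of 32, the infinite deck, and the absence of doubling down, splitting, and surrendering come from the quoted text. The card values, the ace counted as 1 only, the +1/0/-1 returns, the dealer's stand threshold, and the fixed threshold policy standing in for the UCT planner are all assumptions made for illustration, as are the function names.

```python
import random

TARGET = 32  # maximum score from the quoted description (ordinary Blackjack uses 21)

# Assumption: face cards are worth 10 and the ace always counts as 1
# (soft-ace handling is omitted to keep the sketch short).
CARD_VALUES = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]


def draw(rng):
    """Draw from an infinite deck: every rank is equally likely on every draw."""
    return rng.choice(CARD_VALUES)


def play_hand(rng, stand_at):
    """Hit until the total reaches stand_at; the result may exceed TARGET (a bust)."""
    total = 0
    while total < stand_at:
        total += draw(rng)
    return total


def play_game(rng, player_stand_at, dealer_stand_at):
    """Return +1 for a player win, -1 for a loss, 0 for a tie (assumed returns)."""
    player = play_hand(rng, player_stand_at)
    if player > TARGET:
        return -1  # player busts before the dealer acts
    dealer = play_hand(rng, dealer_stand_at)
    if dealer > TARGET or player > dealer:
        return 1
    if player < dealer:
        return -1
    return 0


def average_return(n_games, player_stand_at, dealer_stand_at, seed=0):
    """Monte Carlo estimate of the mean return of a fixed threshold policy."""
    rng = random.Random(seed)
    total = sum(play_game(rng, player_stand_at, dealer_stand_at)
                for _ in range(n_games))
    return total / n_games
```

Calling `average_return(10**5, 27, 28)` would mirror the reported evaluation size of 10^5 games; the two thresholds are hypothetical values, and the paper evaluates a planner rather than a fixed policy.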