Pruning Game Tree by Rollouts

Authors: Bojun Huang

AAAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical In this paper we show that the α-β algorithm and its successor MT-SSS*, as two classic minimax search algorithms, can be implemented as rollout algorithms, a generic algorithmic paradigm widely used in many domains. Specifically, we define a family of rollout algorithms, in which the rollout policy is restricted to select successor nodes only from a subset of the children list. We show that any rollout policy in this family (either deterministic or randomized) is guaranteed to evaluate the game tree correctly with a finite number of rollouts. Moreover, we identify simple rollout policies in this family that implement α-β and MT-SSS*.
Researcher Affiliation Industry Bojun Huang Microsoft Research bojhuang@microsoft.com
Pseudocode Yes Algorithm 1: The α-β algorithm enhanced with storage.; Algorithm 2: The MT-SSS* algorithm.; Algorithm 3: A family of rollout algorithms.; Algorithm 4: A variant of the alphabeta procedure, which returns a pair of value bounds.
Open Source Code No The paper does not provide a specific link to source code for its methodology or explicitly state that the code is publicly available. It only links to the paper itself in the references.
Open Datasets No The paper is theoretical and does not describe empirical experiments involving datasets.
Dataset Splits No The paper is theoretical and does not describe empirical experiments with dataset splits for training, validation, and testing.
Hardware Specification No The paper is theoretical and does not describe any empirical experiments that would require hardware specifications.
Software Dependencies No The paper is theoretical and does not describe any experimental setup that would require specific software dependencies with version numbers.
Experiment Setup No The paper is theoretical and does not describe an experimental setup with hyperparameters or training configurations.