Pruning Game Tree by Rollouts
Authors: Bojun Huang
AAAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper we show that the α-β algorithm and its successor MT-SSS*, as two classic minimax search algorithms, can be implemented as rollout algorithms, a generic algorithmic paradigm widely used in many domains. Specifically, we define a family of rollout algorithms, in which the rollout policy is restricted to select successor nodes only from a subset of the children list. We show that any rollout policy in this family (either deterministic or randomized) is guaranteed to evaluate the game tree correctly with a finite number of rollouts. Moreover, we identify simple rollout policies in this family that implement α-β and MT-SSS*. |
| Researcher Affiliation | Industry | Bojun Huang Microsoft Research bojhuang@microsoft.com |
| Pseudocode | Yes | Algorithm 1: The α-β algorithm enhanced with storage.; Algorithm 2: The MT-SSS* algorithm.; Algorithm 3: A family of rollout algorithms.; Algorithm 4: A variant of the alphabeta procedure, which returns a pair of value bounds. |
| Open Source Code | No | The paper does not provide a specific link to source code for its methodology or explicitly state that the code is publicly available. It only links to the paper itself in the references. |
| Open Datasets | No | The paper is theoretical and does not describe empirical experiments involving datasets. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments with dataset splits for training, validation, and testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any empirical experiments that would require hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental setup that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training configurations. |