Monte Carlo Tree Search in Continuous Action Spaces with Execution Uncertainty

Authors: Timothy Yee, Viliam Lisý, Michael Bowling

IJCAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In a high fidelity simulator of the Olympic sport of curling, we show that this approach significantly outperforms existing MCTS methods." "We evaluate KR-UCT in a high fidelity simulation of the Olympic sport of curling."
Researcher Affiliation | Academia | Timothy Yee, Viliam Lisý, Michael Bowling. Department of Computing Science, University of Alberta, Edmonton, AB, Canada T6G 2E8. {tayee, lisy, bowling}@ualberta.ca
Pseudocode | Yes | Algorithm 1: Kernel Regression UCT
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | No | The paper uses a high fidelity simulator of the Olympic sport of curling, with execution models of Olympic-level players fit to curling percentage statistics of men from the 2010 and 2014 Olympic games. It does not provide a link, DOI, or formal citation to a publicly accessible dataset used for training or evaluation.
Dataset Splits | No | The paper evaluates over simulated games, mentioning "1600 samples" and "16000 one-end games", but it does not specify explicit training, validation, and test dataset splits with percentages or sample counts.
Hardware Specification | No | The acknowledgements state: "The computational resources were made possible by Compute Canada and Calcul Québec." This is a general statement about resources and does not give specific hardware details (e.g., CPU/GPU models, memory) used for the experiments.
Software Dependencies | No | The paper states: "The curling simulator used in this paper is implemented using the Chipmunk 2D rigid body physics library." However, it does not give a version number for this library or any other software dependency, which is necessary for reproducibility.
Experiment Setup | Yes | "All algorithms used 1600 samples and evaluated the final shot selection with a lower confidence bound estimate (C_LCB = 0.001). For each algorithm, we ran a round robin tournament to identify a good UCB constant from the set {0.01, 0.1, 1.0, 10, 100}. For all algorithms, C_UCB = 1.0 was the best constant. For the weighting in RAVE, we did a similar round robin tournament to select the β parameter from the set {0.01, 0.1, 1.0, 10.0, 100.0}, and found β = 1.0 to be the best for RAVE and RAVE+PW. For KR-UCT, we defined = 0.02 and k = 10."
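To make the "Kernel Regression UCT" idea referenced above concrete, here is a minimal sketch of kernel-regression-weighted UCB scoring for a continuous 1-D action. This is an illustration, not the paper's Algorithm 1: the function names (`gaussian_kernel`, `kr_ucb_score`), the Gaussian kernel with bandwidth 0.02 (chosen to echo the 0.02 quoted in the setup), and the simplified exploration term using total raw visits are all assumptions made for this sketch.

```python
import math

def gaussian_kernel(a, b, bandwidth=0.02):
    """Smoothing kernel over the 1-D action gap |a - b| (illustrative choice)."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))

def kr_ucb_score(candidate, actions, visits, mean_values, c_ucb=1.0, bandwidth=0.02):
    """Kernel-regression UCB score for one candidate action (sketch).

    Nearby sampled actions share information through the kernel: the value
    estimate is a kernel-weighted average of neighbours' mean returns, and
    the exploration bonus shrinks as kernel-weighted visit mass near the
    candidate grows.
    """
    # Kernel-weighted visit mass contributed by each previously sampled action.
    w = [gaussian_kernel(candidate, a, bandwidth) * n
         for a, n in zip(actions, visits)]
    w_total = sum(w)
    # Kernel-regression estimate of the candidate's value.
    value = sum(wi * v for wi, v in zip(w, mean_values)) / w_total
    # UCB-style exploration term (simplified here to use total raw visits).
    explore = c_ucb * math.sqrt(math.log(sum(visits)) / w_total)
    return value + explore
```

In use, one would evaluate `kr_ucb_score` over a set of candidate actions and select the argmax, which is how a UCB rule generalizes to continuous actions once visit counts and values are smoothed by the kernel.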