Three-Head Neural Network Architecture for Monte Carlo Tree Search

Authors: Chao Gao, Martin Müller, Ryan Hayward

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present experimental results on 13 × 13 Hex, the largest board size that has been adopted in computer program competitions.
Researcher Affiliation | Academia | Chao Gao, Martin Müller, Ryan Hayward, University of Alberta, {cgao3, mmueller, hayward}@ualberta.ca
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The trained neural net models of MoHex-CNN and the dataset are publicly available at https://drive.google.com/drive/folders/18MdnvMItU7O2sEJDlbmkZzUhZG7yDK9.
Open Datasets | Yes | We use the publicly available training dataset of MoHex-CNN [Gao et al., 2017], generated from MoHex 2.0 self-play, containing about 10^6 distinct state-action-value examples. Each game is an alternating sequence of black and white moves, along with the game result. (Footnote: the trained neural net models of MoHex-CNN and the dataset are publicly available at https://drive.google.com/drive/folders/18MdnvMItU7O2sEJDlbmkZzUhZG7yDK9.)
Dataset Splits | No | As in [Gao et al., 2017], the dataset is partitioned into training and testing sets, where examples from the testing set do not appear in the training set.
Hardware Specification | Yes | We execute experiments on the same Intel i7-6700 CPU computer with a single GTX 1080 GPU and 32 GB RAM.
Software Dependencies | No | The neural nets are implemented with TensorFlow, trained by the Adam optimizer [Kingma and Ba, 2014] using the default learning rate with a mini-batch size of 128 for 100 epochs.
Experiment Setup | Yes | The neural nets are implemented with TensorFlow, trained by the Adam optimizer [Kingma and Ba, 2014] using the default learning rate with a mini-batch size of 128 for 100 epochs. For loss function (5), we set the L2 regularization constant c to 10^-5 and the value loss weight w to 0.01.
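
To make the "Open Datasets" excerpt concrete, here is a minimal sketch of how one self-play game record (an alternating sequence of black and white moves plus the game result) could be unrolled into state-action-value examples. The record format, the function name, and the ±1 value convention are illustrative assumptions, not the dataset's actual encoding.

```python
# Hedged sketch: unrolling a self-play game into state-action-value examples.
# The paper only states that each game is an alternating sequence of black
# and white moves together with the game result; everything else here
# (move encoding, value convention) is assumed for illustration.

def game_to_examples(moves, black_wins):
    """moves: board cells in play order (black moves first, 0-indexed turns).
    black_wins: True if black won the game."""
    examples = []
    state = []  # the moves played so far define the position
    for i, move in enumerate(moves):
        black_to_move = (i % 2 == 0)
        # Value target from the mover's perspective: +1 for a win, -1 for a loss.
        z = 1.0 if black_to_move == black_wins else -1.0
        examples.append((tuple(state), move, z))
        state.append(move)
    return examples

# Example: a 4-move game won by black yields 4 (state, action, value) tuples.
print(game_to_examples(["a1", "b2", "c3", "d4"], black_wins=True))
```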
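
The "Software Dependencies" and "Experiment Setup" rows pin down the training configuration. The sketch below wires those reported hyperparameters into a TensorFlow training step for a three-head network (the paper's policy, state-value, and action-value heads); the network body, head shapes, and the exact composition of loss (5) are assumptions, flagged in the comments.

```python
import tensorflow as tf

# A minimal sketch of the reported setup, not the authors' code. Grounded
# facts: TensorFlow, Adam with its default learning rate, mini-batches of
# 128, 100 epochs, L2 constant c = 1e-5, value loss weight w = 0.01.
# Assumptions: input planes, layer sizes, and how loss (5) combines the heads.

C = 1e-5     # L2 regularization constant c (from the paper)
W = 0.01     # value loss weight w (from the paper)
N = 13 * 13  # 13 x 13 Hex board

inputs = tf.keras.Input(shape=(13, 13, 3))  # input feature planes: assumed
x = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
x = tf.keras.layers.Flatten()(x)
p_logits = tf.keras.layers.Dense(N)(x)                  # policy head p
v = tf.keras.layers.Dense(1, activation="tanh")(x)      # state-value head v
q = tf.keras.layers.Dense(N, activation="tanh")(x)      # action-value head q
model = tf.keras.Model(inputs, [p_logits, v, q])

optimizer = tf.keras.optimizers.Adam()  # default learning rate (1e-3)

@tf.function
def train_step(states, pi, actions, z):
    """pi: target move distribution; actions: played move index; z: result."""
    with tf.GradientTape() as tape:
        p_logits, v, q = model(states, training=True)
        # Cross-entropy for the policy head over the move distribution.
        policy_loss = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=pi, logits=p_logits))
        # Squared error for the state-value head.
        value_loss = tf.reduce_mean(tf.square(tf.squeeze(v, -1) - z))
        # Squared error for the action-value head on the played move only.
        q_taken = tf.gather(q, actions, batch_dims=1)
        q_loss = tf.reduce_mean(tf.square(q_taken - z))
        # L2 penalty over all trainable weights, scaled by c.
        l2 = C * tf.add_n([tf.nn.l2_loss(w_) for w_ in model.trainable_weights])
        loss = policy_loss + W * (value_loss + q_loss) + l2
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Training loop per the paper: mini-batches of 128 for 100 epochs, e.g.
#   dataset = examples.shuffle(10**6).batch(128)
#   for epoch in range(100):
#       for states, pi, actions, z in dataset:
#           train_step(states, pi, actions, z)
```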