reproducibilityindex.ai

Monte Carlo Tree Search with Boltzmann Exploration

Authors: Michael Painter, Mohamed Baioumy, Nick Hawes, Bruno Lacerda

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our empirical analysis shows that our algorithms show consistent high performance across several benchmark domains, including the game of Go.
Researcher Affiliation	Academia	Michael Painter, Mohamed Baioumy, Nick Hawes, Bruno Lacerda; Oxford Robotics Institute, University of Oxford; {mpainter, mohamed, nickh, bruno}@robots.ox.ac.uk
Pseudocode	No	The paper describes algorithms using text and equations but does not provide structured pseudocode blocks or algorithms labeled as such.
Open Source Code	No	The paper does not provide an explicit statement or link for the open-source code of the methodology described in this paper.
Open Datasets	Yes	To validate our approach, we use the Frozen Lake environment [5], and the Sailing Problem [28], commonly used to evaluate tree search algorithms [28, 22, 33, 13]. ... We used an openly available value network V and policy network π from KataGo [36].
Dataset Splits	No	The paper describes evaluation procedures (e.g., 'evaluated every 250 trials using 250 trajectories') but does not specify traditional training/validation/test dataset splits as it primarily uses generative simulation environments rather than static datasets.
Hardware Specification	Yes	Each algorithm was limited to 5 seconds of compute time per move, allowed to use 32 search threads per move, and had access to 80 Intel Xeon E5-2698V4 CPUs clocked at 2.2GHz, and a single Nvidia V100 GPU on a shared compute cluster.
Software Dependencies	No	The paper mentions software such as 'OpenAI Gym' [5] and 'KataGo' [36], but it does not specify version numbers for these or for any other software dependencies needed for replication.
Experiment Setup	Yes	A horizon of 100 was used for Frozen Lake and 50 for the Sailing Problem. Parameters were selected using a hyper-parameter search (Appendix D.3). ... Each algorithm was limited to 5 seconds of compute time per move, allowed to use 32 search threads per move ...