Spending Thinking Time Wisely: Accelerating MCTS with Virtual Expansions
Authors: Weirui Ye, Pieter Abbeel, Yang Gao
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method can achieve comparable performances to the original search algorithm while requiring less than 50% search time on average. We believe that this approach is a viable alternative for tasks under limited time and resources. |
| Researcher Affiliation | Collaboration | Weirui Ye, Pieter Abbeel, Yang Gao (Tsinghua University, UC Berkeley, Shanghai Qi Zhi Institute) |
| Pseudocode | Yes | Algorithm 1 Iteration of vanilla MCTS, Algorithm 2 Iteration of MCTS with Virtual Expansion, Algorithm 3 Virtual MCTS |
| Open Source Code | Yes | The code is available at https://github.com/YeWR/V-MCTS.git. |
| Open Datasets | Yes | The environment of Go is built based on an open-source codebase, GymGo [19]. We evaluate the performance of the agent against GNU Go v3.8 at level 10 [5] for 200 games. ... As for the Atari games, we choose 5 games with 100k environment steps. |
| Dataset Splits | No | The paper describes evaluation procedures against GNU Go and using evaluation seeds for Atari games, but it does not specify traditional dataset validation splits (e.g., percentages or counts for a separate validation set). |
| Hardware Specification | Yes | Recently, Ye et al. [34] proposed EfficientZero, a variant of MuZero [27] with three extra components to improve the sample efficiency, which only requires 8 GPUs in training, and thus it is more affordable. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., Python, deep learning frameworks like PyTorch or TensorFlow, or other libraries). |
| Experiment Setup | Yes | Hyper-parameters: As for the Go 9×9, we choose Tromp-Taylor rules. The environment of Go is built based on an open-source codebase, GymGo [19]. We evaluate the performance of the agent against GNU Go v3.8 at level 10 [5] for 200 games. ... We set the komi to 6.5... As for the Atari games, we choose 5 games with 100k environment steps. In each setting, we use 3 training seeds and 100 evaluation seeds for each trained model. ... The default values of r, are set to 0.2, 0.1. |
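The pseudocode row above names the paper's Algorithms 1-3 (vanilla MCTS, MCTS with virtual expansion, and Virtual MCTS). A minimal sketch of the virtual-expansion idea follows, assuming a PUCT-style selection rule as in the MuZero family; the function names, the toy statistics, and the exploration constant `c` are illustrative assumptions, not the paper's implementation.

```python
import math

def puct_scores(visits, values, priors, c=1.25):
    """PUCT-style score per child (assumed selection rule, MuZero-family style)."""
    total = sum(visits)
    return [
        q + c * p * math.sqrt(total) / (1 + n)
        for q, p, n in zip(values, priors, visits)
    ]

def virtual_expand(visits, values, priors, budget):
    """Sketch of virtual expansion: instead of running the remaining `budget`
    real simulations, repeatedly pick the PUCT-maximising child and bump its
    visit count only. No model rollouts are performed, so the value estimates
    stay fixed; only the visit distribution is extrapolated."""
    visits = list(visits)  # do not mutate the caller's statistics
    for _ in range(budget):
        scores = puct_scores(visits, values, priors)
        best = max(range(len(scores)), key=scores.__getitem__)
        visits[best] += 1
    return visits

# Usage: after 20 real simulations, spend the remaining 30 virtually,
# then pick the action by (virtual) visit count as usual.
visits = [10, 6, 4]
values = [0.50, 0.45, 0.20]
priors = [0.40, 0.40, 0.20]
final = virtual_expand(visits, values, priors, budget=30)
action = max(range(len(final)), key=final.__getitem__)
```

The time saving reported in the table (under 50% of the original search time on average) comes from replacing model evaluations, the expensive step, with these cheap statistics-only updates once the search looks stable.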