Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Spending Thinking Time Wisely: Accelerating MCTS with Virtual Expansions
Authors: Weirui Ye, Pieter Abbeel, Yang Gao
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method can achieve comparable performances to the original search algorithm while requiring less than 50% search time on average. We believe that this approach is a viable alternative for tasks under limited time and resources. |
| Researcher Affiliation | Collaboration | Weirui Ye, Pieter Abbeel, Yang Gao; Tsinghua University, UC Berkeley, Shanghai Qi Zhi Institute |
| Pseudocode | Yes | Algorithm 1 Iteration of vanilla MCTS, Algorithm 2 Iteration of MCTS with Virtual Expansion, Algorithm 3 Virtual MCTS |
| Open Source Code | Yes | The code is available at https://github.com/YeWR/V-MCTS.git. |
| Open Datasets | Yes | The environment of Go is built based on an open-source codebase, GymGo [19]. We evaluate the performance of the agent against GNU Go v3.8 at level 10 [5] for 200 games. ... As for the Atari games, we choose 5 games with 100k environment steps. |
| Dataset Splits | No | The paper describes evaluation procedures against GNU Go and using evaluation seeds for Atari games, but it does not specify traditional dataset validation splits (e.g., percentages or counts for a separate validation set). |
| Hardware Specification | Yes | Recently, Ye et al. [34] proposed EfficientZero, a variant of MuZero [27] with three extra components to improve the sample efficiency, which only requires 8 GPUs in training, and thus it is more affordable. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., Python, deep learning frameworks like PyTorch or TensorFlow, or other libraries). |
| Experiment Setup | Yes | Hyper-parameters: As for the Go 9×9, we choose Tromp-Taylor rules. The environment of Go is built based on an open-source codebase, GymGo [19]. We evaluate the performance of the agent against GNU Go v3.8 at level 10 [5] for 200 games. ... We set the komi to 6.5... As for the Atari games, we choose 5 games with 100k environment steps. In each setting, we use 3 training seeds and 100 evaluation seeds for each trained model. ... The default values of r, are set to 0.2, 0.1. |