Practical Massively Parallel Monte-Carlo Tree Search Applied to Molecular Design
Authors: Xiufeng Yang, Tanuj Aasawat, Kazuki Yoshizoe
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper proposes a novel massively parallel Monte-Carlo Tree Search (MP-MCTS) algorithm that works efficiently at a 1,000-worker scale in a distributed-memory environment using multiple compute nodes, and applies it to molecular design. This paper is the first work that applies distributed MCTS to a real-world, non-game problem. The experimental results show that a simple RNN model combined with massively parallel MCTS outperforms existing work using more complex models combined with Bayesian optimization or reinforcement learning (other than UCT). (A minimal UCT selection sketch follows this table.) |
| Researcher Affiliation | Collaboration | Xiufeng Yang (Chugai Pharmaceutical Co., Ltd., yangxiufengsia@gmail.com); Tanuj Kr Aasawat (Parallel Computing Lab India, Intel Labs, tanuj.aasawat@intel.com); Kazuki Yoshizoe (RIKEN Center for Advanced Intelligence Project, yoshizoe@acm.org) |
| Pseudocode | Yes | D PSEUDO CODE |
| Open Source Code | Yes | Code is available at https://github.com/yoshizoe/mp-chemts |
| Open Datasets | Yes | The model was pre-trained using a molecule dataset that contains 250K drug molecules extracted from the ZINC database, following (Kusner et al., 2017; Yang et al., 2017; Jin et al., 2018). |
| Dataset Splits | No | The paper mentions that the GRU model was pre-trained using a molecule dataset extracted from the ZINC database, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) for this pre-training or for the MCTS experiments themselves. |
| Hardware Specification | Yes | All experiments, unless otherwise specified, were run for 10 minutes on up to 1024 cores of a CPU cluster (each node equipped with two Intel Xeon Gold 6148 CPUs (2.4 GHz, 20 cores) and 384 GB of memory), and one MPI process (called a worker in this paper) is assigned to one core. (A minimal mpi4py startup sketch follows this table.) |
| Software Dependencies | No | The paper states that 'Algorithms are implemented using Keras with TensorFlow and the MPI library for Python (mpi4py)', but it does not provide version numbers for these software components. |
| Experiment Setup | Yes | Input data represents SMILES symbols as 64-dim one-hot vectors. The first GRU layer has a 64-dim input and 256-dim output; the second GRU layer has a 256-dim input and 256-dim output and feeds a final dense layer that outputs 64 values with softmax. In the expansion step, branches (i.e., SMILES symbols) are added in decreasing order of probability until the cumulative probability reaches 0.95. The reward is defined as in Yang et al. (2017), normalized to [-1, 1], with the same exploration constant, C = 1. All experiments, unless otherwise specified, were run for 10 minutes. (A model and expansion-step sketch follows this table.) |
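
The setup row above specifies the generator and expansion rule concretely enough to sketch. Below is a minimal sketch assuming Keras/TensorFlow and NumPy: the model matches the stated layer dimensions, while `expand_children` and all names are illustrative and not taken from the paper's code.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 64  # SMILES symbols encoded as 64-dim one-hot vectors

# Two-layer GRU generator with the dimensions stated in the setup row.
model = tf.keras.Sequential([
    tf.keras.layers.GRU(256, return_sequences=True,
                        input_shape=(None, VOCAB_SIZE)),  # 64-dim in / 256-dim out
    tf.keras.layers.GRU(256, return_sequences=True),      # 256-dim in / 256-dim out
    tf.keras.layers.Dense(VOCAB_SIZE, activation="softmax"),  # next-symbol probabilities
])

def expand_children(probs, threshold=0.95):
    """Expansion step: keep the highest-probability symbols, in decreasing
    order, until their cumulative probability reaches the threshold."""
    order = np.argsort(probs)[::-1]           # symbols sorted by probability, descending
    cumulative = np.cumsum(probs[order])
    keep = int(np.searchsorted(cumulative, threshold)) + 1  # smallest prefix reaching 0.95
    return order[:keep]
```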
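The paper uses UCT for selection with C = 1 and rewards normalized to [-1, 1]. The following is a minimal UCT scoring sketch under those settings; the exact form of the exploration term (with or without the factor of 2 inside the square root) is an assumption, and the function name is illustrative.

```python
import math

def uct_score(value_sum, visits, parent_visits, c=1.0):
    """UCT score for one child: mean reward (in [-1, 1]) plus an
    exploration bonus with exploration constant C = 1."""
    if visits == 0:
        return float("inf")  # unvisited children are selected first
    mean_reward = value_sum / visits
    # Standard UCT bonus; the factor of 2 inside the square root is an
    # assumption, not taken from the paper.
    return mean_reward + c * math.sqrt(2.0 * math.log(parent_visits) / visits)
```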
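The hardware row notes one MPI process (worker) per core, up to 1024 cores, via mpi4py. Here is a minimal mpi4py startup sketch under that setup; the worker body is a hypothetical placeholder, not the paper's distributed-search logic.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()  # worker id: one MPI process per core
size = comm.Get_size()  # e.g., up to 1024 workers on the cluster

# Each worker would run its share of the distributed MCTS here;
# this placeholder only reports that the worker started.
print(f"worker {rank} of {size} ready")
```

Launched as, e.g., `mpiexec -n 1024 python worker.py`, this assigns one process per core, matching the "one MPI process per core" configuration described above.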