Data-Efficient Reinforcement Learning for Malaria Control

Author: Lixin Zou

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "extensive experimental results verify its advantage over the state-of-the-art on the challenging malaria control task."
Researcher Affiliation | Industry | "Lixin Zou, Baidu Inc., zoulixin15@gmail.com"
Pseudocode | Yes | "Algorithm 1: Variance-Bonus Monte Carlo Tree Search"
Open Source Code | Yes | "Our implementation and all baseline codes are available at https://github.com/zoulixin93/VB_MCTS."
Open Datasets | Yes | "extensive experiments on two different OpenMalaria [Smith et al., 2008] based simulators: SeqDec Challenge and Prove Challenge... These simulating environments are available at https://github.com/IBM/ushiriki-policy-engine-library."
Dataset Splits | Yes | "5-fold cross-validation is performed during the updates of the GP world model. Particularly, we use 1-fold for training and 4-fold for validation, which ensures the generalizability of the GP world model over different state-action pairs."
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies | No | The paper does not provide version numbers for the software dependencies or libraries used in the experiments.
Experiment Setup | Yes | "We build a feature map of 14-dimension for this task... we empirically set the sum of exploration/exploitation parameters as β1 + β2 = 3.5 in the experiments. In MCTS, cpuct is 5, and only the top 50 rewarded child nodes are expanded. The number of iterations does not exceed 100,000. For Gaussian Process, to avoid overfitting problems, 5-fold cross-validation is performed during the updates of the GP world model. Particularly, we use 1-fold for training and 4-fold for validation..."
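The quoted experiment setup (β1 + β2 = 3.5, cpuct = 5, expansion limited to the top 50 rewarded children) suggests a UCB-style node score combining a value estimate, an uncertainty bonus, and a PUCT exploration term. A minimal sketch of such a scoring rule, assuming a hypothetical `vb_ucb_score` helper and GP-derived mean/std estimates (these names and the toy numbers are illustrative assumptions, not the paper's actual implementation):

```python
import math

def vb_ucb_score(q_mean, q_std, visits, parent_visits,
                 beta1=2.0, beta2=1.5, c_puct=5.0):
    # Value estimate plus an uncertainty bonus (beta1 + beta2 = 3.5,
    # matching the quoted setup) and a PUCT-style exploration term
    # (c_puct = 5, as in the paper's setup).
    bonus = (beta1 + beta2) * q_std
    explore = c_puct * math.sqrt(parent_visits) / (1 + visits)
    return q_mean + bonus + explore

# Select among the highest-rewarded children (the paper expands only the top 50).
children = [
    {"mean": 0.4, "std": 0.10, "visits": 12},
    {"mean": 0.3, "std": 0.30, "visits": 3},
]
best = max(children, key=lambda c: vb_ucb_score(c["mean"], c["std"],
                                                c["visits"], parent_visits=15))
```

Under this rule, a child with a lower mean but higher posterior variance and fewer visits can outrank a better-known sibling, which is the data-efficiency mechanism the variance bonus is meant to provide.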