Learning Diverse Policies in MOBA Games via Macro-Goals

Authors: Yiming Gao, Bei Shi, Xueying Du, Liang Wang, Guangwei Chen, Zhenjie Lian, Fuhao Qiu, Guoan Han, Weixuan Wang, Deheng Ye, Qiang Fu, Wei Yang, Lanxiao Huang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on the typical MOBA game Honor of Kings demonstrate that MGG can execute diverse policies in different matches and lineups, and also outperform the state-of-the-art methods over 102 heroes."
Researcher Affiliation | Industry | 1) Tencent AI Lab, Shenzhen, China; 2) Tencent TiMi L1 Studio, Chengdu, China. {yatminggao, beishi, sherinedu, enginewang, gorvinchen, leolian, frankfhqiu, guoanhan, waihinwang, dericye, leonfu, willyang, jackiehuang}@tencent.com
Pseudocode | No | The paper includes figures illustrating the framework and network architecture (Figures 1, 4, and 5) but does not contain any formal pseudocode or algorithm blocks with structured, code-like steps.
Open Source Code | No | The paper does not include an unambiguous statement about releasing source code for the methodology, nor does it provide a direct link to a code repository.
Open Datasets | No | "We construct a training dataset by collecting replays from the top 1% human players to train the Meta-Controller." The paper does not provide a link, DOI, or specific repository name for this custom-collected dataset, nor does it cite a published paper that contains the dataset with proper bibliographic information.
Dataset Splits | No | The paper mentions a 'training dataset' and training for a certain number of hours, but it does not specify any explicit training/validation/test dataset splits, percentages, or absolute sample counts for data partitioning.
Hardware Specification | Yes | "We use 8 NVIDIA P40 GPUs for about 26 hours of training, and the batch size of each GPU is set to 512." "MGG and other RL methods adopt self-play training and train by randomly selecting heroes over a physical computer cluster with 60,000 CPUs and 830 NVIDIA V100 GPUs."
Software Dependencies | No | The paper mentions using Adam (Kingma and Ba [2014]) as the optimizer but does not specify any other software components (e.g., programming languages, libraries, frameworks) with specific version numbers.
Experiment Setup | Yes | "We set α = 0.75, γ = 2 for focal loss L_FL (Lin et al. [2017]), and set λ = 1 for the weight of the auxiliary task. We use Adam with the initial learning rate of 0.0001." "...the batch size of each GPU is set to 512..." "The batch size of each GPU is set to 4096." "...the delta C is 30 seconds and the noise ϵ is 3 seconds."
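
For context on the reported Experiment Setup values, the sketch below shows one way the Meta-Controller's supervised objective could be assembled from the quoted settings: focal loss with α = 0.75 and γ = 2 following Lin et al. [2017], auxiliary-task weight λ = 1, and Adam with an initial learning rate of 0.0001. The paper releases no code and does not name its framework, so PyTorch is assumed here; identifiers such as `focal_loss`, `training_step`, and `aux_loss_fn` are illustrative, the model sizes are placeholders, and the additive combination of the focal and auxiliary losses is an assumption rather than something stated in the paper.

```python
# Minimal sketch of the quoted supervised setup (assumed PyTorch; names are illustrative).
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.75, gamma=2.0):
    # Multi-class focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), per Lin et al. [2017].
    log_pt = F.log_softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return (-alpha * (1.0 - pt) ** gamma * log_pt).mean()

# Placeholder Meta-Controller head: 128 input features, 44 macro-goal classes (illustrative sizes).
model = torch.nn.Linear(128, 44)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # "initial learning rate of 0.0001"

def training_step(features, macro_goal_labels, aux_loss_fn=None, lam=1.0):
    # One supervised step; lam = 1 mirrors the quoted auxiliary-task weight, and the
    # combination L = L_FL + lam * L_aux is an assumed reading of that weight.
    logits = model(features)
    loss = focal_loss(logits, macro_goal_labels)
    if aux_loss_fn is not None:
        loss = loss + lam * aux_loss_fn(logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example step using the per-GPU batch size of 512 quoted in the Hardware Specification row.
features = torch.randn(512, 128)
macro_goal_labels = torch.randint(0, 44, (512,))
print(training_step(features, macro_goal_labels))
```

In this reading, the per-GPU batch size of 512 quoted for the 8 P40 GPUs is simply the number of samples fed to each replica per step, while the separately quoted batch size of 4096 belongs to the other (self-play RL) training stage described in the Hardware Specification row.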