Generating Behavior-Diverse Game AIs with Evolutionary Multi-Objective Deep Reinforcement Learning
Authors: Ruimin Shen, Yan Zheng, Jianye Hao, Zhaopeng Meng, Yingfeng Chen, Changjie Fan, Yang Liu
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Evaluations on the Atari and real commercial games indicate that, compared to existing algorithms, EMOGI performs better in generating diverse behaviors and significantly improves the efficiency of Game AI design." and "This section presents empirical results on an Atari game pong and a commercial game Justice Online (JO)." |
| Researcher Affiliation | Collaboration | Ruimin Shen (1), Yan Zheng (2,3), Jianye Hao (2,4,5), Zhaopeng Meng (2), Yingfeng Chen (1), Changjie Fan (1), Yang Liu (3,6); (1) Fuxi Lab, NetEase; (2) College of Intelligence and Computing, Tianjin University; (3) Nanyang Technological University, Singapore; (4) Noah's Ark Lab, Huawei; (5) Tianjin Key Lab of Machine Learning; (6) Institute of Computing Innovation, Zhejiang University |
| Pseudocode | Yes | Algorithm 1: EMOGI and Algorithm 2: Diverse-Select (a generic, illustrative selection sketch appears after this table) |
| Open Source Code | No | The paper provides links to supplementary details and videos but does not include an explicit statement or link to the open-source code for the methodology described. |
| Open Datasets | No | The paper mentions using 'Atari game pong' and 'commercial game Justice Online (JO)' for evaluation but does not provide concrete access information (link, DOI, or specific citation to a dataset repository) for these game environments or any specific datasets derived from them. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit standard split references). |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU models, CPU types, or memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions methods like A3C and DRL, but it does not specify any software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x, Python 3.x). |
| Experiment Setup | No | The paper describes the reward function setup and mentions that "all baselines use the same hyper-parameters defined in [Mnih et al., 2016]" for A3C, referring to an external paper for hyperparameter details rather than providing them directly within the text (a hedged configuration sketch follows the table). |
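
The Pseudocode row above names Algorithm 2: Diverse-Select but the report does not reproduce it. As a point of reference only, the sketch below shows the kind of population filtering an evolutionary multi-objective method typically performs: keep Pareto-non-dominated policies under several behavior objectives, then spread the selection out in objective space. This is an assumption-based illustration, not the paper's actual Diverse-Select procedure, and the objective names are hypothetical.

```python
# Illustrative sketch only: generic non-dominated selection over
# multi-objective behavior scores. NOT the paper's Algorithm 2.
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if score vector `a` Pareto-dominates `b` (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def select_diverse(scores: List[Sequence[float]], k: int) -> List[int]:
    """Return indices of up to `k` policies from the Pareto front.

    `scores[i]` holds the behavior-objective values of policy i
    (e.g. hypothetical objectives such as aggressiveness vs. win rate).
    """
    # Keep only policies not dominated by any other policy.
    front = [
        i for i, s in enumerate(scores)
        if not any(dominates(scores[j], s) for j in range(len(scores)) if j != i)
    ]
    # Greedily pick members that are far apart in objective space,
    # so the surviving policies stay behaviorally spread out.
    selected = [front[0]]
    while len(selected) < min(k, len(front)):
        def min_dist(i):
            return min(
                sum((scores[i][d] - scores[j][d]) ** 2 for d in range(len(scores[i])))
                for j in selected
            )
        remaining = [i for i in front if i not in selected]
        selected.append(max(remaining, key=min_dist))
    return selected


# Example: four policies scored on two hypothetical behavior objectives.
print(select_diverse([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5], [0.2, 0.2]], k=3))
```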
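
On the Experiment Setup row: the paper defers its A3C hyperparameters to [Mnih et al., 2016]. For orientation, the configuration below lists the values commonly reported in that external paper; the exact settings used for EMOGI's baselines are not restated in this paper, so treat these as an assumption rather than the authors' recorded setup.

```python
# Hedged sketch of the A3C hyperparameters commonly attributed to
# Mnih et al. (2016); not values stated in the EMOGI paper itself.
A3C_CONFIG = {
    "num_actor_threads": 16,        # asynchronous CPU actor-learners
    "n_step": 5,                    # t_max: rollout length per update
    "discount_gamma": 0.99,
    "entropy_beta": 0.01,           # entropy regularization weight
    "optimizer": "shared RMSProp",
    "rmsprop_decay": 0.99,
    "learning_rate": "sampled from LogUniform(1e-4, 1e-2)",
    "action_repeat": 4,             # standard Atari frame skip
}
```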