Generating Behavior-Diverse Game AIs with Evolutionary Multi-Objective Deep Reinforcement Learning

Authors: Ruimin Shen, Yan Zheng, Jianye Hao, Zhaopeng Meng, Yingfeng Chen, Changjie Fan, Yang Liu

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Evaluations on the Atari and real commercial games indicate that, compared to existing algorithms, EMOGI performs better in generating diverse behaviors and significantly improves the efficiency of Game AI design." and "This section presents empirical results on an Atari game pong and a commercial game Justice Online (JO)."
Researcher Affiliation | Collaboration | Ruimin Shen1, Yan Zheng2,3, Jianye Hao2,4,5, Zhaopeng Meng2, Yingfeng Chen1, Changjie Fan1 and Yang Liu3,6. Affiliations: 1Fuxi Lab, NetEase; 2College of Intelligence and Computing, Tianjin University; 3Nanyang Technological University, Singapore; 4Noah's Ark Lab, Huawei; 5Tianjin Key Lab of Machine Learning; 6Institute of Computing Innovation, Zhejiang University
Pseudocode | Yes | Algorithm 1: EMOGI and Algorithm 2: Diverse-Select (a generic illustrative sketch of this kind of loop follows the table)
Open Source Code | No | The paper provides links to supplementary details and videos but does not include an explicit statement about, or a link to, open-source code for the described method.
Open Datasets | No | The paper uses the 'Atari game pong' and the 'commercial game Justice Online (JO)' for evaluation but does not provide concrete access information (a link, DOI, or specific citation to a repository) for these game environments or for any datasets derived from them.
Dataset Splits | No | The paper does not provide details of training, validation, or test splits (e.g., percentages, sample counts, or references to standard splits).
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU types, or memory) used to run its experiments.
Software Dependencies | No | The paper mentions methods such as A3C and DRL but does not specify software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x, Python 3.x).
Experiment Setup | No | The paper describes the reward function setup and states that 'all baselines use the same hyper-parameters defined in [Mnih et al., 2016]' for A3C, referring to an external paper for hyperparameter details rather than listing them directly.
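
For readers unfamiliar with this family of methods, the Pseudocode row refers to the paper's Algorithm 1 (EMOGI) and Algorithm 2 (Diverse-Select). Below is a minimal, hypothetical Python sketch of a generic evolutionary multi-objective loop with a diversity-aware selection step. It illustrates the general paradigm under assumed names (dominates, pareto_front, diverse_select, evolve) and a stubbed evaluation; it is not a reproduction of the paper's algorithms or hyperparameters.

```python
# Generic sketch of a population-based multi-objective selection loop.
# All names and the stubbed evaluation are illustrative assumptions; this is
# NOT the paper's Algorithm 1 (EMOGI) or Algorithm 2 (Diverse-Select).
import random
import numpy as np

def dominates(a, b):
    """Pareto dominance: a dominates b if it is no worse on every objective
    and strictly better on at least one."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a >= b) and np.any(a > b))

def pareto_front(population):
    """Keep only the non-dominated individuals, judged on their objective vectors."""
    return [p for p in population
            if not any(dominates(q["objectives"], p["objectives"])
                       for q in population if q is not p)]

def diverse_select(front, k):
    """Greedy farthest-point pick of up to k individuals whose behavior
    descriptors are maximally spread out (a stand-in for a diversity-aware pick)."""
    beh = [np.asarray(p["behavior"], dtype=float) for p in front]
    remaining = list(range(len(front)))
    chosen = [max(remaining, key=lambda i: np.linalg.norm(beh[i]))]
    remaining.remove(chosen[0])
    while remaining and len(chosen) < k:
        nxt = max(remaining,
                  key=lambda i: min(np.linalg.norm(beh[i] - beh[j]) for j in chosen))
        chosen.append(nxt)
        remaining.remove(nxt)
    return [front[i] for i in chosen]

def evolve(pop_size=8, generations=10, n_objectives=3):
    """Outer loop: evaluate, keep a diverse Pareto subset, refill by mutating
    the survivors' objective-weight vectors."""
    population = [{"weights": np.random.dirichlet(np.ones(n_objectives)),
                   "objectives": np.zeros(n_objectives),
                   "behavior": np.zeros(2)}
                  for _ in range(pop_size)]
    for _ in range(generations):
        for ind in population:
            # Stub evaluation; a real system would train a policy (e.g., with A3C)
            # on a reward weighted by ind["weights"] and report its per-objective
            # returns plus a behavior descriptor.
            ind["objectives"] = np.random.rand(n_objectives)
            ind["behavior"] = np.random.rand(2)
        survivors = diverse_select(pareto_front(population), max(1, pop_size // 2))
        offspring = [{"weights": np.clip(parent["weights"] +
                                         0.1 * np.random.randn(n_objectives), 0.0, 1.0),
                      "objectives": np.zeros(n_objectives),
                      "behavior": np.zeros(2)}
                     for parent in random.choices(survivors, k=pop_size - len(survivors))]
        population = survivors + offspring
    return population

if __name__ == "__main__":
    final = evolve()
    print(f"kept {len(final)} individuals")
```

The greedy farthest-point selection above is only one common way to trade Pareto quality against behavioral spread; the actual selection criterion and the A3C-based inner training loop are specified in the paper's own pseudocode.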