Multi-Agent Game Abstraction via Graph Attention Neural Network

Authors: Yong Liu, Weixun Wang, Yujing Hu, Jianye Hao, Xingguo Chen, Yang Gao (pp. 7211-7218)

AAAI 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments are conducted in Traffic Junction and Predator-Prey. The results indicate that the proposed methods can simplify the learning process while achieving better asymptotic performance than state-of-the-art algorithms. |
| Researcher Affiliation | Collaboration | Yong Liu (1), Weixun Wang (2), Yujing Hu (3), Jianye Hao (2,4), Xingguo Chen (5), Yang Gao (1). 1: National Key Laboratory for Novel Software Technology, Nanjing University; 2: Tianjin University; 3: NetEase Fuxi AI Lab; 4: Noah's Ark Lab, Huawei; 5: Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications. |
| Pseudocode | No | The paper does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any statement about open-sourcing the code or a link to a code repository. |
| Open Datasets | Yes | The simulated traffic-junction environments from (Singh, Jain, and Sukhbaatar 2019) consist of cars moving along pre-assigned, potentially conflicting routes on one or more road junctions. As shown in Figure 8(a), predator-prey is chosen as the second test environment, where the adversary agents (red) are slower and need to capture the good agents (green), and the good agents are faster and need to escape. |
| Dataset Splits | No | The paper describes the number of agents and environment settings (e.g., "the number of agents in the easy, medium, and hard environments is 5, 10, and 20, respectively" and "We trained the model in the setting of Na = 5 and Ng = 2 for 1500 episodes"), but it does not specify explicit dataset splits (e.g., percentages or counts for training, validation, and test sets). |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU types, or memory specifications are provided for the experimental setup. |
| Software Dependencies | No | No specific software dependencies with version numbers are mentioned. |
| Experiment Setup | Yes | The number of agents in the easy, medium, and hard environments is 5, 10, and 20, respectively. The task is made harder by always setting vision to zero in all three difficulty levels. The action space for each car is gas and brake, and the reward consists of a linear time penalty 0.01τ, where τ is the number of time steps since the car became active, and a collision penalty r_collision = 10. The model is trained in the setting of Na = 5 and Ng = 2 for 1500 episodes, where Na is the number of adversary agents and Ng the number of good agents. Adversary agents receive a reward of +10 when they capture good agents. |
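The traffic-junction reward structure quoted above (a linear time penalty of 0.01τ plus a collision penalty) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, argument names, and the assumption that both penalties are subtracted from the reward are ours.

```python
def traffic_junction_reward(tau, collided, time_penalty=0.01, r_collision=10.0):
    """Illustrative per-step reward for one car in the traffic-junction task.

    tau       -- number of time steps since the car became active
                 (incurs a linear time penalty of time_penalty * tau)
    collided  -- whether the car is involved in a collision this step
                 (incurs an additional penalty of r_collision)

    Both penalties are assumed to be subtracted; the paper states their
    magnitudes (0.01*tau and r_collision = 10) without explicit signs.
    """
    reward = -time_penalty * tau
    if collided:
        reward -= r_collision
    return reward
```

For example, a car active for 5 steps with no collision would receive -0.05, while a colliding car at the same step would receive -10.05 under these assumptions.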