Coordination Between Individual Agents in Multi-Agent Reinforcement Learning

Authors: Yang Zhang, Qingyu Yang, Dou An, Chengwei Zhang (pp. 11387-11394)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results show that the proposed method outperforms the state-of-the-art MARL methods and can measure the correlation between individual agents accurately.
Researcher Affiliation | Academia | Yang Zhang (1), Qingyu Yang (1,2,3)*, Dou An (1,2,3), Chengwei Zhang (4). (1) Faculty of Electronics and Information, Xi'an Jiaotong University, Xi'an, China; (2) State Key Laboratory for Manufacturing System Engineering (SKLMSE), Xi'an Jiaotong University, Xi'an, China; (3) Ministry of Education (MOE) Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, China; (4) Dalian Maritime University, Dalian, China
Pseudocode | Yes | Algorithm 1: ALC-MADDPG algorithm
Open Source Code | No | The paper does not provide any explicit statements or links indicating the availability of open-source code for the described methodology.
Open Datasets | Yes | The experiments were performed in four mixed competitive-cooperative environments, which can be found in (Lowe et al. 2017; Mordatch and Abbeel 2018).
Dataset Splits | No | The paper refers to 'training episodes' and the 'mean received reward of agents in an episode' but does not explicitly specify training, validation, or test dataset splits (e.g., percentages or counts) as commonly defined in supervised learning.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies | No | The paper mentions various MARL methods (e.g., MADDPG, VDN, QMIX) and reinforcement learning algorithms (DQN, DDPG), but it does not specify any software libraries or dependencies with version numbers used for the implementation.
Experiment Setup | Yes | In each actor and critic network, there were two fully-connected layers, each consisting of 64 neurons. The parameters τ, γ, the learning rate, and the size of the replay memory were set to 0.1, 0.95, 0.01, and 100,000, respectively.
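Since the paper's code is not public, the setup row above is the only concrete configuration information available. Below is a minimal sketch of what that configuration could look like; it is not the authors' implementation. PyTorch, the class names (Actor, Critic, soft_update), and the centralized-critic input layout are assumptions borrowed from standard MADDPG practice, and the replay-memory size is read as 100,000 from the garbled figure in the paper.

```python
# Illustrative sketch only: a two-layer (64-unit) actor/critic pair with the
# hyperparameters quoted in the "Experiment Setup" row. Names and structure are
# hypothetical; the paper's ALC-MADDPG code is not publicly available.
import torch
import torch.nn as nn

TAU = 0.1              # soft-update coefficient for target networks
GAMMA = 0.95           # discount factor
LEARNING_RATE = 0.01   # learning rate for actor and critic optimizers
REPLAY_SIZE = 100_000  # replay-memory capacity (assumed reading of the setup row)

class Actor(nn.Module):
    """Maps an agent's local observation to a continuous action."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

class Critic(nn.Module):
    """Centralized critic: scores the joint observation and joint action."""
    def __init__(self, joint_obs_dim: int, joint_act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(joint_obs_dim + joint_act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, joint_obs: torch.Tensor, joint_act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

def soft_update(target: nn.Module, source: nn.Module, tau: float = TAU) -> None:
    """Polyak-average the source parameters into the target network."""
    with torch.no_grad():
        for t, s in zip(target.parameters(), source.parameters()):
            t.mul_(1.0 - tau).add_(tau * s)
```

Each agent would hold one Actor and one Critic (plus target copies updated with soft_update), with optimizers constructed at LEARNING_RATE; how the paper's attention-based correlation measure plugs into this training loop cannot be reconstructed from the reproducibility details alone.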