reproducibilityindex.ai

Celebrating Diversity in Shared Multi-Agent Reinforcement Learning

Authors: Chenghao Li, Tonghan Wang, Chengjie Wu, Qianchuan Zhao, Jun Yang, Chongjie Zhang

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical results show that our method achieves state-of-the-art performance on Google Research Football and super hard StarCraft II micromanagement tasks. We benchmark our approach on Google Research Football (GRF) [18], and StarCraft II micromanagement tasks (SMAC) [16]. We compare our approach against multi-agent value-based methods (QMIX [5], QPLEX [6]), variational exploration (MAVEN [25]), and individuality emergence (EOI [26]) methods. We carry out ablation studies to test the contribution of its three main components.
Researcher Affiliation	Academia	Chenghao Li, Tonghan Wang, Chengjie Wu, Qianchuan Zhao, Jun Yang, Chongjie Zhang; Tsinghua University; {lich18, wangth18, wucj19}@mails.tsinghua.edu.cn, {zhaoqc, yangjun603, chongjie}@tsinghua.edu.cn
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	Yes	Videos are available at https://sites.google.com/view/celebrate-diversity-shared with codes.
Open Datasets	Yes	We benchmark our approach on Google Research Football (GRF) [18], and StarCraft II micromanagement tasks (SMAC) [16].
Dataset Splits	No	The paper discusses training and performance evaluation but does not specify explicit train/validation/test dataset splits (percentages, counts, or predefined splits) for reproducibility.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies	No	The paper mentions various algorithms and frameworks (e.g., QPLEX, QMIX), but does not provide specific version numbers for any software dependencies.
Experiment Setup	No	The paper mentions hyperparameters like β and λ but does not provide their specific values or other concrete experimental setup details such as learning rates, batch sizes, or optimizer settings in the main text.