Celebrating Diversity in Shared Multi-Agent Reinforcement Learning
Authors: Chenghao Li, Tonghan Wang, Chengjie Wu, Qianchuan Zhao, Jun Yang, Chongjie Zhang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that our method achieves state-of-the-art performance on Google Research Football and super hard StarCraft II micromanagement tasks. We benchmark our approach on Google Research Football (GRF) [18], and StarCraft II micromanagement tasks (SMAC) [16]. We compare our approach against multi-agent value-based methods (QMIX [5], QPLEX [6]), variational exploration (MAVEN [25]), and individuality emergence (EOI [26]) methods. We carry out ablation studies to test the contribution of its three main components. |
| Researcher Affiliation | Academia | Chenghao Li, Tonghan Wang, Chengjie Wu, Qianchuan Zhao, Jun Yang, Chongjie Zhang; Tsinghua University; {lich18, wangth18, wucj19}@mails.tsinghua.edu.cn, {zhaoqc, yangjun603, chongjie}@tsinghua.edu.cn |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Videos are available at https://sites.google.com/view/celebrate-diversity-shared with codes. |
| Open Datasets | Yes | We benchmark our approach on Google Research Football (GRF) [18], and StarCraft II micromanagement tasks (SMAC) [16]. |
| Dataset Splits | No | The paper discusses training and performance evaluation but does not specify explicit train/validation/test dataset splits (percentages, counts, or predefined splits) for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions various algorithms and frameworks (e.g., QPLEX, QMIX), but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | No | The paper mentions hyperparameters like β and λ but does not provide their specific values or other concrete experimental setup details such as learning rates, batch sizes, or optimizer settings in the main text. |