LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning
Authors: Mingyu Yang, Jian Zhao, Xunhan Hu, Wengang Zhou, Jiangcheng Zhu, Houqiang Li
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that LDSA learns reasonable and effective subtask assignment for better collaboration and significantly improves the learning performance on the challenging StarCraft II micromanagement benchmark and Google Research Football. |
| Researcher Affiliation | Collaboration | 1. University of Science and Technology of China, 2. Huawei Cloud, 3. Hefei Comprehensive National Science Center, Institute of Artificial Intelligence |
| Pseudocode | No | The paper describes the proposed method in text and figures (e.g., Figure 1), but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We include the code, data, and instructions needed to reproduce the main experimental results in the supplemental material. |
| Open Datasets | Yes | We evaluate LDSA on the SMAC benchmark [27]... Additional experiments on Google Research Football (GRF) [20]... |
| Dataset Splits | No | The paper evaluates performance based on 'test win rate' over 'timesteps' and 'episodes' in reinforcement learning environments (SMAC and GRF), rather than explicitly defining 'training/test/validation dataset splits' in the manner of static supervised learning datasets. |
| Hardware Specification | No | The paper mentions support by a 'GPU cluster built by MCC Lab of Information Science and Technology Institution, USTC' in the Acknowledgments, but does not provide specific GPU models, CPU models, or detailed hardware specifications within the provided text. |
| Software Dependencies | No | The paper mentions implementing baselines using 'PyMARL [27]' but does not provide specific version numbers for PyMARL or any other software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | For all experiments, the number of subtasks is set to 4 and the length of subtask representations is set to 64, i.e., k = 4, m = 64. We carry out a grid search for regularization coefficients λϕ and λh on the SMAC scenario corridor and then set them to 10^-3 and 10^-3, respectively, for all scenarios. |
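The hyperparameters quoted in the Experiment Setup row can be collected into a single config, as one might pass to a PyMARL-style training script. This is a hypothetical sketch for illustration only; the key names below are assumptions, not identifiers from the authors' released code.

```python
# Hypothetical config collecting the LDSA hyperparameters reported in the paper.
# Key names are illustrative; only the values are taken from the quoted setup.
ldsa_config = {
    "n_subtasks": 4,         # k: number of subtasks
    "subtask_repr_dim": 64,  # m: length of subtask representations
    "lambda_phi": 1e-3,      # regularization coefficient (grid-searched on corridor)
    "lambda_h": 1e-3,        # regularization coefficient (grid-searched on corridor)
}

print(ldsa_config)
```

Both regularization coefficients land on the same value (10^-3) after the grid search, so in practice a single shared coefficient would reproduce the reported setting.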