Hierarchical Multi-Agent Skill Discovery

Authors: Mingyu Yang, Yaodong Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate HMASD on sparse reward multi-agent benchmarks, and the results show that HMASD achieves significant performance improvements compared to strong MARL baselines." (Abstract) "In this section, we evaluate the effectiveness of our method. We first conduct a case study to show how HMASD effectively learns diverse useful skills and combines them to complete the task. Then, we compare HMASD with strong MARL baselines on two challenging sparse reward multi-agent benchmarks, i.e., SMAC [40] with 0-1 reward and Overcooked [41]. We further perform ablation studies for HMASD to confirm the benefits of components in our method." (Section 4, Experiments)
Researcher Affiliation | Academia | Mingyu Yang¹, Yaodong Yang², Zhenbo Lu³, Wengang Zhou¹,³, Houqiang Li¹,³. ¹University of Science and Technology of China; ²Institute for AI, Peking University; ³Institute of Artificial Intelligence, Hefei Comprehensive National Science Center. ymy@mail.ustc.edu.cn, yaodong.yang@pku.edu.cn, luzhenbo@iai.ustc.edu.cn, {zhwg,lihq}@ustc.edu.cn
Pseudocode | Yes | "A. Pseudo Code of Hierarchical Multi-Agent Skill Discovery" and "Algorithm 1: Hierarchical Multi-Agent Skill Discovery" (Appendix A)
Open Source Code | No | The paper makes no explicit statement about releasing the source code for HMASD and gives no link to a repository.
Open Datasets | Yes | "Then, we compare HMASD with strong MARL baselines on two challenging sparse reward multi-agent benchmarks, i.e., SMAC [40] with 0-1 reward and Overcooked [41]." (Section 4, Experiments)
Dataset Splits | No | No explicit training/validation/test dataset splits are specified. The paper mentions "eval episodes" and "eval rollout threads", but these refer to evaluation settings for reinforcement learning policies rather than dataset splits.
Hardware Specification | No | "This work is supported by National Key R&D Program of China under Contract 2022ZD0119802, and National Natural Science Foundation of China under Contract 61836011. It was also supported by GPU cluster built by MCC Lab of Information Science and Technology Institution, USTC, and the Supercomputing Center of the USTC." (Acknowledgments) The paper mentions only a "GPU cluster", without specific models or quantities.
Software Dependencies | No | No specific software dependencies with version numbers are provided.
Experiment Setup | Yes | "The hyperparameter setting can be found in Appendix E." "Table 1: Common hyperparameters used for HMASD, MAT and MAPPO across all tasks." "Table 2: Common hyperparameters used for HMASD, MAT and MAPPO in different tasks." "Table 3: Different hyperparameters used for HMASD in different scenarios." (Appendix E) These tables list specific hyperparameter values.