SPD: Synergy Pattern Diversifying Oriented Unsupervised Multi-agent Reinforcement Learning

Authors: Yuhang Jiang, Jianzhun Shao, Shuncheng He, Hongchang Zhang, Xiangyang Ji

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, we show the capacity of SPD to acquire meaningful coordination policies, such as maintaining specific formations in Multi-Agent Particle Environment and pass-and-shoot in Google Research Football. Furthermore, we demonstrate that the same instructive pretrained policy's parameters can serve as a good initialization for a series of downstream tasks' policies, achieving higher data efficiency and outperforming state-of-the-art approaches in Google Research Football."
Researcher Affiliation | Academia | Yuhang Jiang, Jianzhun Shao, Shuncheng He, Hongchang Zhang, Xiangyang Ji; Department of Automation, Tsinghua University, Beijing, China
Pseudocode | Yes | Algorithm 1: SPD
Open Source Code | Yes | "Our code is available at https://github.com/thu-rllab/SPD."
Open Datasets | Yes | "We first train SPD on the complicated MARL environment: Google Research Football [18] without environment reward. ... we first evaluate the diversity of coordination policies learned by SPD and URL baselines in Multi-agent Particle Environment [22, 45]." (An environment-loading sketch follows the table.)
Dataset Splits | No | The paper describes the scenarios and environments used but does not specify explicit train/validation/test splits or their proportions.
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies | No | The paper mentions software components such as QMIX, the Sinkhorn-Knopp algorithm, and the Kuhn-Munkres algorithm, but it does not specify version numbers for these or any other software dependencies. (An illustrative sketch of the two matching routines follows the table.)
Experiment Setup | Yes | "The hyper-parameters are kept to be the same, and please refer to Appendix B.2 for details."
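
The Open Datasets row points to simulation environments (Google Research Football and the Multi-agent Particle Environment) rather than fixed datasets. As a minimal sketch of what instantiating such an environment involves, the snippet below creates a Google Research Football scenario through the open-source `gfootball` package; the scenario name, observation representation, and number of controlled agents are illustrative assumptions, not settings taken from the paper.

```python
# Minimal sketch: instantiating a Google Research Football scenario.
# The scenario name and player count below are illustrative assumptions;
# the paper's exact configuration is described in its Appendix B.2.
# Note: SPD itself is trained without the environment reward.
import gfootball.env as football_env

env = football_env.create_environment(
    env_name="academy_pass_and_shoot_with_keeper",  # assumed scenario
    representation="simple115v2",                   # flat feature vector per controlled player
    number_of_left_players_agent_controls=2,        # assumed number of controlled agents
    rewards="scoring",
)

obs = env.reset()
for _ in range(10):
    actions = env.action_space.sample()  # random joint action (illustration only)
    obs, reward, done, info = env.step(actions)
    if done:
        obs = env.reset()
```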
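The Software Dependencies row names the Sinkhorn-Knopp and Kuhn-Munkres algorithms without versions. The sketch below is a generic illustration of both routines, using alternating row/column normalization for Sinkhorn-Knopp and `scipy.optimize.linear_sum_assignment` for Kuhn-Munkres; it is not SPD's implementation, and the cost matrix is a made-up example.

```python
# Generic illustrations of the two matching routines named in the paper;
# not SPD's implementation. The cost matrix is a made-up example.
import numpy as np
from scipy.optimize import linear_sum_assignment  # Kuhn-Munkres (Hungarian) solver

def sinkhorn_knopp(cost: np.ndarray, epsilon: float = 0.1, n_iters: int = 50) -> np.ndarray:
    """Approximately doubly-stochastic soft assignment via Sinkhorn-Knopp iterations."""
    K = np.exp(-cost / epsilon)            # Gibbs kernel from the cost matrix
    for _ in range(n_iters):
        K /= K.sum(axis=1, keepdims=True)  # normalize rows
        K /= K.sum(axis=0, keepdims=True)  # normalize columns
    return K

cost = np.array([[4.0, 1.0, 3.0],
                 [2.0, 0.0, 5.0],
                 [3.0, 2.0, 2.0]])

# Soft (approximately doubly-stochastic) assignment.
plan = sinkhorn_knopp(cost)

# Hard one-to-one assignment minimizing total cost.
rows, cols = linear_sum_assignment(cost)

print(plan.round(2))
print(list(zip(rows.tolist(), cols.tolist())), cost[rows, cols].sum())
```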