ROMA: Multi-Agent Reinforcement Learning with Emergent Roles

Authors: Tonghan Wang, Heng Dong, Victor Lesser, Chongjie Zhang

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our method can learn specialized, dynamic, and identifiable roles, which help our method push forward the state of the art on the StarCraft II micromanagement benchmark. Demonstrative videos are available at https://sites.google.com/view/romarl/. We test our method on StarCraft II micromanagement environments (Vinyals et al., 2017; Samvelyan et al., 2019). Results show that our method significantly pushes forward the state of the art of MARL algorithms, by virtue of the adaptive policy sharing among agents with similar roles. From Section 5 (Experiments): Our experiments aim to answer the following questions: (1) Can the learned roles automatically adapt in dynamic environments? (Sec. 5.1.) (2) Can our method promote sub-task specialization? That is, agents with similar responsibilities have similar role embedding representations, while agents with different responsibilities have role embedding representations far from each other. (Sec. 5.1, 5.3.) (3) Can such sub-task specialization improve the performance of multi-agent reinforcement learning algorithms? (Sec. 5.2.) (4) How do roles evolve during training, and how do they influence team performance? (Sec. 5.4.) (5) Can the dissimilarity model dφ learn to measure the dissimilarity between agents' trajectories? (Sec. 5.4.)
Researcher Affiliation | Academia | Tonghan Wang¹, Heng Dong¹, Victor Lesser², Chongjie Zhang¹. ¹IIIS, Tsinghua University, Beijing, China; ²University of Massachusetts, Amherst, USA.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Videos of our experiments and the code are available online; the code is at https://github.com/TonghanWang/ROMA.
Open Datasets | Yes | We test our method on StarCraft II micromanagement environments (Vinyals et al., 2017; Samvelyan et al., 2019).
Dataset Splits | No | The paper uses the StarCraft II micromanagement environments but does not explicitly provide training/validation/test dataset splits (e.g., percentages or counts) or reference predefined splits for this environment.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiments.
Experiment Setup | Yes | We carry out a grid search over the loss coefficients λI and λD, and fix them at 10^-4 and 10^-2, respectively, across all the experiments. The dimensionality of the latent role space is set to 3, so we did not use any dimensionality reduction techniques when visualizing the role embedding representations. Other hyperparameters are also fixed in our experiments and are listed in Appendix B.1. (A sketch of how these coefficients might enter the training objective follows the table.)
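
To make the reported setup concrete, below is a minimal PyTorch sketch of one plausible way the two coefficients could combine the main TD objective with the role regularizers. The term names (td_loss, identifiability_loss, dissimilarity_loss) and the additive form are assumptions inferred from the paper's description of λI and λD as loss coefficients, not the authors' released code; their actual implementation is at https://github.com/TonghanWang/ROMA.

import torch

# Hypothetical sketch, not the authors' code: combines the TD loss with the
# two role-embedding regularizers, using the coefficients reported in the paper.
LAMBDA_I = 1e-4  # loss coefficient lambda_I, fixed after a grid search
LAMBDA_D = 1e-2  # loss coefficient lambda_D, fixed after a grid search
ROLE_DIM = 3     # latent role dimensionality; 3-D embeddings can be plotted directly

def total_loss(td_loss: torch.Tensor,
               identifiability_loss: torch.Tensor,
               dissimilarity_loss: torch.Tensor) -> torch.Tensor:
    """Weighted sum of the main TD objective and the two role regularizers."""
    return td_loss + LAMBDA_I * identifiability_loss + LAMBDA_D * dissimilarity_loss

One reading of the reported magnitudes, assuming this additive form, is that the TD term dominates the objective while the small regularizer weights gently shape the role space rather than compete with the main learning signal.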