ROMA: Multi-Agent Reinforcement Learning with Emergent Roles
Authors: Tonghan Wang, Heng Dong, Victor Lesser, Chongjie Zhang
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method can learn specialized, dynamic, and identifiable roles, which help our method push forward the state of the art on the StarCraft II micromanagement benchmark. Demonstrative videos are available at https://sites.google.com/view/romarl/. We test our method on StarCraft II micromanagement environments (Vinyals et al., 2017; Samvelyan et al., 2019). Results show that our method significantly pushes forward the state of the art of MARL algorithms, by virtue of the adaptive policy sharing among agents with similar roles. 5. Experiments: Our experiments aim to answer the following questions: (1) Whether the learned roles can automatically adapt in dynamic environments? (Sec. 5.1.) (2) Can our method promote sub-task specialization? That is, agents with similar responsibilities have similar role embedding representations, while agents with different responsibilities have role embedding representations far from each other. (Sec. 5.1, 5.3.) (3) Can such sub-task specialization improve the performance of multi-agent reinforcement learning algorithms? (Sec. 5.2.) (4) How do roles evolve during training, and how do they influence team performance? (Sec. 5.4.) (5) Can the dissimilarity model dφ learn to measure the dissimilarity between agents' trajectories? (Sec. 5.4.) |
| Researcher Affiliation | Academia | Tonghan Wang 1, Heng Dong 1, Victor Lesser 2, Chongjie Zhang 1. 1 IIIS, Tsinghua University, Beijing, China; 2 University of Massachusetts, Amherst, USA. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Videos of our experiments and the code are available online. https://github.com/TonghanWang/ROMA |
| Open Datasets | Yes | We test our method on StarCraft II micromanagement environments (Vinyals et al., 2017; Samvelyan et al., 2019). |
| Dataset Splits | No | The paper mentions using the StarCraft II micromanagement environments but does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or counts) or reference predefined splits for this environment. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | We carry out a grid search over the loss coefficients λI and λD, and fix them at 10^-4 and 10^-2, respectively, across all the experiments. The dimensionality of the latent role space is set to 3, so we did not use any dimensionality reduction techniques when visualizing the role embedding representations. Other hyperparameters are also fixed in our experiments, which are listed in Appendix B.1. |
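The Experiment Setup row reports two regularization weights found by grid search. A minimal sketch of how such coefficients typically enter the training objective is shown below; the function and variable names are illustrative assumptions, not the authors' identifiers, and the loss structure (a TD loss plus two weighted role regularizers) is inferred from the paper's description rather than taken from the released code.

```python
# Illustrative sketch: combining a TD loss with two role regularizers,
# using the coefficient values reported in the reproducibility table.
# (Assumed structure; see the authors' repository for the actual code.)
LAMBDA_I = 1e-4  # weight of the role-identifiability regularizer (λI)
LAMBDA_D = 1e-2  # weight of the role-dissimilarity regularizer (λD)

def total_loss(td_loss: float, identifiability_loss: float,
               dissimilarity_loss: float) -> float:
    """Weighted sum of the main objective and the two role terms."""
    return (td_loss
            + LAMBDA_I * identifiability_loss
            + LAMBDA_D * dissimilarity_loss)
```

Because λI and λD differ by two orders of magnitude, the dissimilarity term dominates the role regularization at these settings; the grid search mentioned in the paper would sweep both values jointly.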