Controlling Behavioral Diversity in Multi-Agent Reinforcement Learning
Authors: Matteo Bettini, Ryan Kortvelesy, Amanda Prorok
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We theoretically prove that DiCo achieves the desired diversity, and we provide several experiments, both in cooperative and competitive tasks, that show how DiCo can be employed as a novel paradigm to increase performance and sample efficiency in MARL. Multimedia results are available on the paper's website. |
| Researcher Affiliation | Academia | 1Department of Computer Science, University of Cambridge, Cambridge, UK. Correspondence to: Matteo Bettini <mb2389@cl.cam.ac.uk>. |
| Pseudocode | Yes | The DiCo pseudocode is reported in Sec. G.1. In Alg. 1, we present the pseudocode for policy evaluation in DiCo. |
| Open Source Code | Yes | The code and running instructions are available on GitHub at https://github.com/proroklab/ControllingBehavioralDiversity. |
| Open Datasets | Yes | We consider the tasks in Fig. 3 from the VMAS simulator (Bettini et al., 2022). Training is performed in the BenchMARL library (Bettini et al., 2023a) using TorchRL (Bou et al., 2024) as the backend. |
| Dataset Splits | No | The paper mentions using 'training data batches' and a 'replay buffer' but does not specify distinct training, validation, and test dataset splits with percentages or sample counts. |
| Hardware Specification | Yes | For the purpose of rapid experimentation, we estimate 500 compute hours using an NVIDIA GeForce RTX 2080 Ti GPU and an Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz. For the purpose of running the final experiments on multiple seeds, we estimate 2000 HPC compute hours using an NVIDIA A100-SXM-80GB GPU and 32 cores of an AMD EPYC 7763 64-Core Processor CPU @ 1.8GHz. |
| Software Dependencies | No | The paper mentions software such as BenchMARL, TorchRL, and Hydra and provides citations to their respective papers. However, it does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | Experiment configurations use the BenchMARL (Bettini et al., 2023a) configuration structure, leveraging Hydra (Yadan, 2019) to decouple YAML configuration files from the Python codebase. Configuration parameters can be found in the conf folder, sorted into experiment, algorithm, task, and model sub-folders. Each file has thorough documentation explaining the effect and meaning of each hyperparameter. (An illustrative sketch of this Hydra configuration pattern is given below the table.) |
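
As a rough illustration of the configuration pattern described in the Experiment Setup row, the minimal sketch below shows how Hydra composes YAML files from a `conf/` folder into a single config object at launch time. The config group names and hyperparameter keys used here (`experiment`, `algorithm`, `task`, `model`, `lr`, `max_steps`) are hypothetical placeholders for illustration, not the actual files or keys of the BenchMARL/DiCo repository.

```python
# Sketch of a Hydra-based experiment entry point, assuming a hypothetical
# conf/ layout such as:
#
#   conf/config.yaml:
#     defaults:
#       - experiment: base
#       - algorithm: mappo
#       - task: vmas_task
#       - model: mlp
#
#   conf/experiment/base.yaml:
#     lr: 3.0e-4
#     max_steps: 1_000_000

import hydra
from omegaconf import DictConfig, OmegaConf


@hydra.main(config_path="conf", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    # Print the fully composed configuration, including any command-line
    # overrides, e.g.:  python run.py algorithm=mappo experiment.lr=1e-4
    print(OmegaConf.to_yaml(cfg))
    # ... build and run the experiment from `cfg` here ...


if __name__ == "__main__":
    main()
```

Because hyperparameters live in the YAML groups and can be overridden on the command line, the experiment, algorithm, task, and model configurations can be varied independently without touching the Python code, which is the decoupling the paper attributes to its BenchMARL/Hydra setup.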