Controlling Behavioral Diversity in Multi-Agent Reinforcement Learning

Authors: Matteo Bettini, Ryan Kortvelesy, Amanda Prorok

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We theoretically prove that DiCo achieves the desired diversity, and we provide several experiments, both in cooperative and competitive tasks, that show how DiCo can be employed as a novel paradigm to increase performance and sample efficiency in MARL. Multimedia results are available on the paper's website."
Researcher Affiliation | Academia | "Department of Computer Science, University of Cambridge, Cambridge, UK. Correspondence to: Matteo Bettini <mb2389@cl.cam.ac.uk>."
Pseudocode | Yes | "The DiCo pseudocode is reported in Sec. G.1. In Alg. 1, we present the pseudocode for policy evaluation in DiCo." (A schematic policy-composition sketch follows the table.)
Open Source Code | Yes | "The code and running instructions are available on GitHub at https://github.com/proroklab/ControllingBehavioralDiversity."
Open Datasets | Yes | "We consider the tasks in Fig. 3 from the VMAS simulator (Bettini et al., 2022). Training is performed in the BenchMARL library (Bettini et al., 2023a) using TorchRL (Bou et al., 2024) in the backend." (An environment-instantiation sketch follows the table.)
Dataset Splits | No | "The paper mentions using 'training data batches' and a 'replay buffer' but does not specify distinct training, validation, and test dataset splits with percentages or sample counts."
Hardware Specification | Yes | "For the purpose of rapid experimentation, we estimate 500 compute hours using an NVIDIA GeForce RTX 2080 Ti GPU and an Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz. For the purpose of running the final experiments on multiple seeds, we estimate 2000 HPC compute hours using an NVIDIA A100-SXM-80GB GPU and 32 cores of an AMD EPYC 7763 64-Core Processor CPU @ 1.8GHz."
Software Dependencies | No | "The paper mentions software like BenchMARL, TorchRL, and Hydra and provides citations to their respective papers. However, it does not provide specific version numbers for these software components." (A version-recording sketch follows the table.)
Experiment Setup | Yes | "Experiment configurations use the BenchMARL (Bettini et al., 2023a) configuration structure, leveraging Hydra (Yadan, 2019) to decouple YAML configuration files from the Python codebase. Configuration parameters can be found in the conf folder, sorted in experiment, algorithm, task, and model sub-folders. Each file has thorough documentation explaining the effect and meaning of each hyperparameter." (A configuration-loading sketch follows the table.)
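
Regarding the Pseudocode row: the paper's Alg. 1 covers policy evaluation in DiCo, which is described as expressing each policy as the sum of a parameter-shared component and dynamically scaled per-agent components so that a chosen diversity metric hits a target value. Below is a minimal, hypothetical sketch of that general idea only, using deterministic actions and a simple mean pairwise distance as a stand-in diversity measure; the names DiversityControlledPolicy, shared_net, deviation_nets, and snd_target are assumptions, and the authoritative formulation is the one in Sec. G.1 of the paper.

```python
import torch
import torch.nn as nn


class DiversityControlledPolicy(nn.Module):
    """Schematic only: a shared policy plus rescaled per-agent deviations.

    This is NOT the paper's Alg. 1. It illustrates the general idea of
    expressing each agent's action as a parameter-shared component plus a
    per-agent component that is dynamically rescaled so that a diversity
    measure (here, mean pairwise L2 distance between the unscaled
    deviations) matches a target value.
    """

    def __init__(self, n_agents: int, obs_dim: int, act_dim: int, snd_target: float):
        super().__init__()
        self.snd_target = snd_target
        self.shared_net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim)
        )
        self.deviation_nets = nn.ModuleList(
            nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
            for _ in range(n_agents)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: [batch, n_agents, obs_dim] -> actions: [batch, n_agents, act_dim]
        shared = self.shared_net(obs)
        deviations = torch.stack(
            [net(obs[:, i]) for i, net in enumerate(self.deviation_nets)], dim=1
        )
        # Stand-in diversity measure of the unscaled deviations: mean pairwise
        # L2 distance between agents (self-distances included for simplicity).
        current = torch.cdist(deviations, deviations).mean()
        # Rescale the per-agent components so the measure hits the target.
        scale = self.snd_target / (current + 1e-8)
        return shared + scale * deviations


# Usage with hypothetical shapes:
policy = DiversityControlledPolicy(n_agents=3, obs_dim=8, act_dim=2, snd_target=0.5)
actions = policy(torch.randn(16, 3, 8))  # -> [16, 3, 2]
```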
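
Regarding the Open Datasets row: the tasks come from the VMAS vectorized simulator. Below is a minimal sketch of instantiating a VMAS scenario for this kind of experiment; the scenario name ("navigation"), the keyword arguments, and the assumed 2D continuous action shape are illustrative assumptions that may differ across VMAS versions, not the paper's exact setup.

```python
import torch
from vmas import make_env

# Hypothetical sketch: instantiate a batch of vectorized VMAS environments.
# The scenario name, keyword arguments, and per-agent action shape are
# assumptions; check the VMAS documentation for your installed version.
num_envs = 32
env = make_env(
    scenario="navigation",
    num_envs=num_envs,
    device="cpu",
    continuous_actions=True,
)

obs = env.reset()
for _ in range(10):
    # Zero actions for every agent (assuming a 2D continuous action space).
    actions = [torch.zeros(num_envs, 2) for _ in env.agents]
    obs, rewards, dones, infos = env.step(actions)
```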
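
Regarding the Software Dependencies row: since exact versions are not pinned in the paper, a reproduction can at least record the versions actually installed at run time using the standard library. The package names listed below (benchmarl, torchrl, vmas, hydra-core) are assumptions about what a reproduction environment would contain.

```python
from importlib.metadata import PackageNotFoundError, version

# Record the installed versions of the (assumed) key dependencies of a
# reproduction environment, since the paper does not pin exact versions.
for pkg in ("benchmarl", "torchrl", "vmas", "hydra-core", "torch"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```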
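
Regarding the Experiment Setup row: configurations are Hydra-managed YAML files in the repository's conf folder, split into experiment, algorithm, task, and model sub-folders. The sketch below shows how such a layout could be composed and overridden with Hydra's compose API; the config_path, config_name, and override names are assumptions inferred from that description, not the repository's actual entry point.

```python
from hydra import compose, initialize

# Hypothetical sketch: compose an experiment configuration from a Hydra-managed
# `conf` folder and override a few config groups. The relative path, config
# name, and group/field names are assumptions made for illustration only.
with initialize(version_base=None, config_path="conf"):
    cfg = compose(
        config_name="config",
        overrides=[
            "algorithm=mappo",       # choose an algorithm config group
            "task=vmas/navigation",  # choose a task config group
            "experiment.seed=0",     # override a single experiment field
        ],
    )

print(cfg)
```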