MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning

Authors: Mikayel Samvelyan, Akbir Khan, Michael D Dennis, Minqi Jiang, Jack Parker-Holder, Jakob Nicolaus Foerster, Roberta Raileanu, Tim Rocktäschel

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that MAESTRO outperforms a number of strong baselines on competitive two-player games, spanning discrete and continuous control settings.
Researcher Affiliation | Collaboration | Meta AI; University College London; UC Berkeley; University of Oxford. Contact: samvelyan@meta.com
Pseudocode | Yes | Algorithm 1 provides its pseudocode.
Open Source Code | No | The paper does not provide a specific link to source code for the described methodology or an explicit statement of code release.
Open Datasets | Yes | Laser Tag is a grid-based, two-player zero-sum game proposed by Lanctot et al. (2017); Multi Car Racing (MCR; Schwarting et al., 2021); the MCR test environments are the Formula 1 Car Racing tracks from Jiang et al. (2021a).
Dataset Splits | Yes | We selected the best-performing settings based on the average return on the unseen validation levels against previously unseen opponents, over at least 5 seeds.
Hardware Specification | Yes | All experiments are performed on an internal cluster. Each job (representing a seed) is performed with a single Tesla V100 GPU and 10 CPUs.
Software Dependencies | No | The paper mentions software such as PPO, Griddly, and PyTorch but does not provide specific version numbers for these dependencies as used in its experiments.
Experiment Setup | Yes | Table 2 summarises our final hyperparameter choices for all methods.
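
The Pseudocode row above points to Algorithm 1 of the paper, which curates (co-player, level) pairs for the student agent. As a rough, non-authoritative illustration of that kind of loop, the Python sketch below assumes a PLR-style buffer of regret-scored levels kept per co-player; CoPlayerBuffer, sample_random_level, rollout, estimate_regret and ppo_update are hypothetical placeholders, not the authors' implementation.

import random

# Minimal, illustrative sketch of a MAESTRO-style curation loop (not the authors'
# code): a student policy trains against (co-player, level) pairs, where each
# co-player keeps its own buffer of levels prioritised by estimated regret.

class CoPlayerBuffer:
    """Per-co-player buffer of levels scored by estimated regret."""

    def __init__(self, capacity=256):
        self.capacity = capacity
        self.scores = {}  # level id -> regret estimate

    def add(self, level, regret):
        self.scores[level] = regret
        if len(self.scores) > self.capacity:
            # Evict the lowest-regret level so the buffer keeps challenging levels.
            del self.scores[min(self.scores, key=self.scores.get)]

    def sample(self):
        # Replay a level with probability proportional to its regret estimate.
        levels, weights = zip(*self.scores.items())
        return random.choices(levels, weights=weights, k=1)[0]


# --- hypothetical stubs so the sketch is self-contained ---
def sample_random_level():
    return random.randrange(10_000)  # stand-in for an environment/level generator

def rollout(student, co_player, level):
    # Stand-in for collecting an episode of student vs. co-player on the level.
    return {"level": level, "advantages": [random.random() for _ in range(8)]}

def estimate_regret(trajectory):
    # e.g. an average-positive-advantage proxy, as in PLR-style regret estimates.
    adv = trajectory["advantages"]
    return sum(max(a, 0.0) for a in adv) / len(adv)

def ppo_update(student, trajectory):
    pass  # placeholder for the PPO update step


def maestro_step(student, co_players, buffers, replay_prob=0.5):
    """One curation step: pick a co-player, pick a level, train the student."""
    # Uniform co-player choice for simplicity; the paper additionally prioritises
    # which co-player to face.
    co_player = random.choice(co_players)
    buffer = buffers[co_player]

    if buffer.scores and random.random() < replay_prob:
        level = buffer.sample()          # replay a known high-regret level
    else:
        level = sample_random_level()    # explore a new level

    trajectory = rollout(student, co_player, level)
    buffer.add(level, estimate_regret(trajectory))
    ppo_update(student, trajectory)

A driver would build one buffer per co-player, e.g. buffers = {c: CoPlayerBuffer() for c in co_players}, and call maestro_step repeatedly; the replay_prob split between replaying curated levels and exploring new ones mirrors a PLR-style replay decision.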
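
The Dataset Splits row describes selecting hyperparameters by the average return on unseen validation levels against previously unseen opponents, over at least 5 seeds. The sketch below merely spells out that selection rule; select_best_setting and evaluate_return are hypothetical names, and evaluate_return stands in for whatever per-(setting, level, opponent, seed) evaluation the authors ran.

from statistics import mean

def select_best_setting(settings, validation_levels, unseen_opponents, seeds, evaluate_return):
    """Pick the setting with the highest return averaged over levels, opponents and seeds.

    evaluate_return(setting, level, opponent, seed) is a hypothetical callable that
    returns the episode return of the policy trained with `setting` and `seed`
    against `opponent` on `level`.
    """
    def score(setting):
        return mean(
            evaluate_return(setting, level, opponent, seed)
            for level in validation_levels
            for opponent in unseen_opponents
            for seed in seeds
        )

    return max(settings, key=score)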