MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning
Authors: Mikayel Samvelyan, Akbir Khan, Michael D Dennis, Minqi Jiang, Jack Parker-Holder, Jakob Nicolaus Foerster, Roberta Raileanu, Tim Rocktäschel
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that MAESTRO outperforms a number of strong baselines on competitive two-player games, spanning discrete and continuous control settings. |
| Researcher Affiliation | Collaboration | ¹Meta AI, ²University College London, ³UC Berkeley, ⁴University of Oxford; samvelyan@meta.com |
| Pseudocode | Yes | Algorithm 1 provides its pseudocode. |
| Open Source Code | No | The paper does not provide a specific link to source code for the described methodology or an explicit statement of code release. |
| Open Datasets | Yes | Laser Tag is a grid-based, two-player zero-sum game proposed by Lanctot et al. (2017); Multi Car Racing (MCR) is from Schwarting et al. (2021); the MCR test environments are the Formula 1 Car Racing tracks from Jiang et al. (2021a). |
| Dataset Splits | Yes | We selected the best performing settings based on the average return on the unseen validation levels against previously unseen opponents on at least 5 seeds. |
| Hardware Specification | Yes | All experiments are performed on an internal cluster. Each job (representing a seed) is performed with a single Tesla V100 GPU and 10 CPUs. |
| Software Dependencies | No | The paper mentions software like PPO, Griddly, and PyTorch but does not provide specific version numbers for these dependencies as used in their experiments. |
| Experiment Setup | Yes | Table 2 summarises our final hyperparameter choices for all methods. |