Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers
Authors: Luke Marris, Paul Muller, Marc Lanctot, Karl Tuyls, Thore Graepel
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct several experiments using CE meta-solvers for JPSRO and demonstrate convergence on n-player, general-sum games. |
| Researcher Affiliation | Collaboration | 1DeepMind 2University College London 3Université Gustave Eiffel. |
| Pseudocode | Yes | Algorithm 1 Two-Player PSRO ... Algorithm 2 JPSRO |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | All games used are available in OpenSpiel (Lanctot et al., 2019). |
| Dataset Splits | No | The paper describes training multi-agent systems within game environments, but does not specify explicit training/validation/test dataset splits with percentages or sample counts for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running experiments. |
| Software Dependencies | No | The paper mentions software like CVXPY, OSQP, and OpenSpiel, but does not provide specific version numbers for these dependencies, which are necessary for reproducible setup. |
| Experiment Setup | Yes | We use an exact BR oracle, and exactly evaluate policies in the meta-game by traversing the game tree to precisely isolate the MS's contribution to the algorithm. ... Random solvers were evaluated with five seeds and we plot the mean. ... Experiments were ran for up to 6 hours, after which they were terminated. |
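
The Software Dependencies row flags missing version numbers for CVXPY, OSQP, and OpenSpiel. As a minimal sketch of how such versions could be recorded when attempting a reproduction, the snippet below queries the local Python environment and prints requirements-style pins. The distribution names (`cvxpy`, `osqp`, `open_spiel`) and the helper names `installed_versions` and `format_pins` are assumptions made for illustration; they do not come from the paper, and the versions reported are those of the local environment, not the ones the authors used.

```python
from importlib import metadata

# Assumed PyPI distribution names (not stated in the paper); adjust as needed.
PACKAGES = ["cvxpy", "osqp", "open_spiel"]


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_pins(versions):
    """Render requirements-style pins; missing packages become comment lines."""
    lines = []
    for name, version in sorted(versions.items()):
        if version is None:
            lines.append(f"# {name}: not installed")
        else:
            lines.append(f"{name}=={version}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_pins(installed_versions(PACKAGES)))
```

Saving this output alongside experiment results would give later readers the version information that the table marks as absent.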