Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Towards a Standardised Performance Evaluation Protocol for Cooperative MARL
Authors: Rihab Gorsane, Omayma Mahjoub, Ruan John de Kock, Roland Dubb, Siddarth Singh, Arnu Pretorius
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | By conducting a detailed meta-analysis of prior work, spanning 75 papers accepted for publication from 2016 to 2022, we bring to light worrying trends that put into question the true rate of progress. We further consider these trends in a wider context and take inspiration from single-agent RL literature on similar issues with recommendations that remain applicable to MARL. Combining these recommendations, with novel insights from our analysis, we propose a standardised performance evaluation protocol for cooperative MARL. Finally, we release our meta-analysis data publicly on our project website for future research on evaluation, accompanied by our open-source evaluation tools repository. |
| Researcher Affiliation | Collaboration | Rihab Gorsane¹, Omayma Mahjoub¹,², Ruan de Kock¹, Roland Dubb¹,³, Siddarth Singh¹, Arnu Pretorius¹ (¹InstaDeep; ²National School of Computer Science, Tunisia; ³University of Cape Town, South Africa) |
| Pseudocode | No | The paper presents its proposed protocol in a bulleted list format within a blue box, but it is a set of recommendations and guidelines, not a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Finally, we release our meta-analysis data publicly on our project website for future research on evaluation, accompanied by our open-source evaluation tools repository: https://github.com/instadeepai/marl-eval. A hedged sketch of the kind of aggregation this tooling supports follows the table. |
| Open Datasets | Yes | Finally, we release our meta-analysis data publicly on our project website for future research on evaluation, accompanied by our open-source evaluation tools repository. In total, we collected data from 75 cooperative MARL papers accepted for publication... We believe this dataset is the first of its kind and we have made it publicly available for further analysis. |
| Dataset Splits | No | This paper conducts a meta-analysis of existing papers and does not involve training machine learning models with specific dataset splits for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used to conduct its meta-analysis. |
| Software Dependencies | No | The paper provides a link to an open-source evaluation tools repository but does not list specific software dependencies with version numbers used for its meta-analysis. |
| Experiment Setup | No | The paper describes the parameters for the *recommended* evaluation protocol for MARL, but not the specific experimental setup (e.g., hyperparameters, training configurations) for *its own* meta-analysis. |
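
Since the Open Source Code row points to the released marl-eval tooling, the following is a minimal sketch of the kind of score aggregation the paper's protocol recommends (IQM with bootstrap confidence intervals), written against the rliable library from the single-agent RL evaluation literature that marl-eval builds on. The algorithm names and score matrices are synthetic placeholders, and `reps` is reduced from typical values for speed; this illustrates the general pattern, not the exact marl-eval API.

```python
import numpy as np
from rliable import library as rly
from rliable import metrics

# Synthetic placeholder scores: one matrix per algorithm with shape
# (num_independent_runs, num_tasks), normalised to [0, 1].
rng = np.random.default_rng(0)
score_dict = {
    "algo_a": rng.uniform(0.4, 0.9, size=(10, 5)),  # hypothetical algorithm
    "algo_b": rng.uniform(0.3, 0.8, size=(10, 5)),  # hypothetical algorithm
}

# Aggregate each algorithm's scores with median, IQM, and mean; the
# interquartile mean (IQM) is the headline statistic the protocol favours.
aggregate_fn = lambda x: np.array([
    metrics.aggregate_median(x),
    metrics.aggregate_iqm(x),
    metrics.aggregate_mean(x),
])

# Bootstrap point estimates and confidence intervals over runs
# (reps kept small here for a quick check; increase for real reporting).
point_estimates, interval_estimates = rly.get_interval_estimates(
    score_dict, aggregate_fn, reps=2000
)

for algo, (median, iqm, mean) in point_estimates.items():
    print(f"{algo}: median={median:.3f} IQM={iqm:.3f} mean={mean:.3f}")
```

Reporting IQM with stratified bootstrap intervals, rather than a bare mean over a handful of seeds, is the core of the statistical recommendations the paper carries over from the single-agent RL literature it cites.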