reproducibilityindex.ai

Optimistic Multi-Agent Policy Gradient

Authors: Wenshuai Zhao, Yi Zhao, Zhiyuan Li, Juho Kannala, Joni Pajarinen

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In extensive evaluations on a diverse set of tasks including the Multi-agent MuJoCo and Overcooked benchmarks, our method outperforms strong baselines on 13 out of 19 tested tasks and matches the performance on the rest.
Researcher Affiliation	Academia	1 Department of Electrical Engineering and Automation, Aalto University, Finland; 2 School of Computer Science and Engineering, University of Electronic Science and Technology of China, China; 3 Department of Computer Science, Aalto University, Finland; 4 University of Oulu, Finland.
Pseudocode	Yes	Algorithm 1 Optimistic Multi-Agent Proximal Policy Optimization (OptiMAPPO)
Open Source Code	Yes	Source Code: https://github.com/wenshuaizhao/optimappo
Open Datasets	Yes	In extensive evaluations on a diverse set of tasks including the Multi-agent MuJoCo and Overcooked benchmarks, our method outperforms strong baselines on 13 out of 19 tested tasks and matches the performance on the rest.
Dataset Splits	No	The paper mentions '100 evaluation episodes' for MA-MuJoCo, 'episode length of repeated games as 25' for matrix games, and 'Episode Length 400' for Overcooked tasks. However, it does not explicitly provide percentages or counts for training, validation, or test dataset splits.
Hardware Specification	No	The paper mentions 'computational resources provided by the Aalto Science-IT project and CSC, Finnish IT Center for Science', but it does not specify any exact GPU or CPU models, memory amounts, or detailed computer specifications used for running experiments.
Software Dependencies	No	The paper does not provide specific version numbers for software components or libraries, such as Python, PyTorch, or CUDA.
Experiment Setup	Yes	We use the same hyperparameters listed in Table 5. The implementation is based on the HAPPO (Kuba et al., 2022) codebase, and the other hyperparameters are the default. ... In all the tasks of Overcooked, we use the same hyperparameters listed in Table 6.