Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts
Authors: Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical experiments on competitive and cooperative tasks demonstrate that AORPO can achieve improved sample efficiency with comparable asymptotic performance over the compared MARL methods. |
| Researcher Affiliation | Academia | Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou. Shanghai Jiao Tong University. {wnzhang, leoxhwang, rocky, mingak}@sjtu.edu.cn |
| Pseudocode | Yes | Algorithm 1: AORPO Algorithm |
| Open Source Code | No | No explicit statement providing concrete access to source code (e.g., repository link, explicit code release statement) for the methodology described in this paper was found. |
| Open Datasets | Yes | Based on a multi-agent particle environment, we evaluate our method in two types of cooperative tasks... Multi-Agent Particle Environment [Lowe et al., 2017] |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning into train/validation/test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions algorithms like MASAC and MADDPG but does not provide specific software names with version numbers (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiment. |
| Experiment Setup | Yes | Other implementation details, including network architectures and important hyperparameters, are provided in Appendix F. |