Dealing with Non-Stationarity in MARL via Trust-Region Decomposition

Authors: Wenhao Li, Xiangfeng Wang, Bo Jin, Junjie Sheng, Hongyuan Zha

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 EXPERIMENTS. This section aims to verify the effectiveness of the trust-region constraints, the existence of the trust-region decomposition dilemma, and the capacity of the TRD-Net with 4 cooperative tasks: Spread, Multi-Walker, Rover-Tower, Pursuit (more details are in the appendix).
Researcher Affiliation | Academia | School of Computer Science and Technology, East China Normal University, Shanghai, China; {52194501026@stu, xfwang@cs, bjin@cs, 52194501003@stu}.ecnu.edu.cn. Hongyuan Zha: School of Data Science, The Chinese University of Hong Kong (Shenzhen); Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; zhahy@cuhk.edu.cn
Pseudocode | Yes | Algorithm 1 MAMT
Open Source Code | Yes | The source code of this paper is available at https://anonymous.4open.science/r/MAMT.
Open Datasets | No | The paper describes environments (e.g., Spread, Multi-Walker, Rover-Tower, Pursuit) but does not provide concrete access information (links, DOIs, formal citations for dataset download) for them as publicly available datasets.
Dataset Splits | No | The paper does not explicitly state training/validation/test dataset splits (e.g., percentages or counts) for reproduction. It describes simulation environments.
Hardware Specification | Yes | The hardware used in the experiment is a server with 128G memory and 4 NVIDIA 1080Ti graphics cards with 11G video memory.
Software Dependencies | No | The paper refers to using PyTorch and other related libraries through the provided codebases, but does not explicitly list specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9') for its own implementation.
Experiment Setup | Yes | Table 3: Default settings of our methods used in experiments.