Complementary Attention for Multi-Agent Reinforcement Learning

Authors: Jianzhun Shao, Hongchang Zhang, Yun Qu, Chang Liu, Shuncheng He, Yuhang Jiang, Xiangyang Ji

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method on three commonly used benchmarks: StarCraft II (SC2) (Samvelyan et al., 2019), Multi-Agent Particle Environment (MPE) (Lowe et al., 2017), and Traffic Junction (Sukhbaatar et al., 2016). The proposed CAMA significantly outperforms SOTA methods on all conducted experiments and exhibits remarkable robustness to sight-range variation and dynamic team composition.
Researcher Affiliation | Academia | Department of Automation, Tsinghua University, Beijing, China. Correspondence to: Xiangyang Ji <xyji@tsinghua.edu.cn>.
Pseudocode | No | The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | We submit the code of all the experiments in a GitHub repository: https://github.com/qyz55/CAMA.
Open Datasets | No | The paper uses established simulation environments such as StarCraft II, the Multi-Agent Particle Environment, and a modified Traffic Junction. While these environments are well known or described, the paper does not provide specific links or access information for *datasets* (i.e., collections of static data files) that are publicly available for download.
Dataset Splits | No | The paper mentions training and testing phases and specifies parameters for each (e.g., agent numbers for training vs. testing in Resource Collection), but it does not give explicit train/validation/test dataset splits, such as percentages, absolute sample counts, or specific predefined splits with citations for data partitioning.
Hardware Specification | Yes | We ran experiments on 2 GPU servers, each with 8× RTX 3090 Ti GPUs and 2× AMD EPYC 7H12 CPUs. Each experiment (one seed) takes 12-24 hours on one GPU.
Software Dependencies | No | The paper does not explicitly list software dependencies with version numbers, such as the programming language, libraries, or frameworks used for implementation (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | The hyperparameters are summarized in Table 6. Examples include: γ (discount factor) 0.99, lr (learning rate) 0.0005, n_batch (batch size) 32, λ1 (weight for L_IM) 0.005, λ2 (weight for L_MI) 0.1.
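For reference, below is a minimal sketch of how the hyperparameters reported in Table 6 could be gathered into a single training configuration. The key names are illustrative assumptions, not the exact configuration schema used in the CAMA repository.

```python
# Minimal sketch: the hyperparameters reported in Table 6, collected into one
# config dict. Key names are illustrative assumptions, not the exact schema
# used in the authors' code at https://github.com/qyz55/CAMA.
cama_config = {
    "gamma": 0.99,        # discount factor
    "lr": 0.0005,         # learning rate
    "batch_size": 32,     # n_batch, episodes per training batch
    "lambda_1": 0.005,    # weight for the L_IM loss term
    "lambda_2": 0.1,      # weight for the L_MI loss term
}

if __name__ == "__main__":
    # Print the assumed configuration for a quick sanity check.
    for key, value in cama_config.items():
        print(f"{key} = {value}")
```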