TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient
Authors: Xingzhou Lou, Junge Zhang, Timothy J. Norman, Kaiqi Huang, Yali Du
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results on several benchmarks show that the agent topology facilitates agent cooperation and alleviates the CDM issue, improving the performance of TAPE. Multiple ablation studies and a heuristic graph search algorithm demonstrate the efficacy of the agent topology. |
| Researcher Affiliation | Academia | ¹School of Artificial Intelligence, University of Chinese Academy of Sciences; ²Institute of Automation, Chinese Academy of Sciences; ³University of Southampton; ⁴King's College London |
| Pseudocode | Yes | Pseudo-code and more details of stochastic TAPE are provided in Appendix E.1. |
| Open Source Code | Yes | Our code is available at github.com/LxzGordon/TAPE. |
| Open Datasets | Yes | Level-Based Foraging (LBF) (Papoudakis et al. 2021) and the StarCraft Multi-Agent Challenge (SMAC) (Samvelyan et al. 2019) |
| Dataset Splits | No | The paper uses common benchmarks (LBF, SMAC) but does not provide train/validation/test splits, percentages, or sample counts, nor does it cite a resource that defines such splits for its experiments. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or solvers used in the experiments. |
| Experiment Setup | No | The paper states that "All algorithms are run for four times with different random seeds. Each run lasts for 5 × 10^6 environmental steps. During training, each algorithm has four parallel environment[s] to collect training data," but it does not provide specific hyperparameter values (e.g., learning rate, batch size, optimizer settings) or detailed configuration steps in the main text; a hedged sketch of the reported run protocol follows the table. |
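
The Experiment Setup row quotes a run-level protocol (four seeds, 5 × 10^6 environment steps, four parallel environments) but no hyperparameters. The sketch below is a minimal illustration of what that protocol could look like against the SMAC benchmark; the map name (`3s5z`), the function names (`rollout`, `run_training`), the placeholder action selection, and the sequential treatment of the "parallel" environments are all assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the reported run protocol:
# 4 seeds, 5e6 environment steps per run, 4 parallel environments, on SMAC.
from smac.env import StarCraft2Env  # SMAC benchmark (Samvelyan et al. 2019)

N_SEEDS = 4                  # "run for four times with different random seeds"
TOTAL_ENV_STEPS = 5_000_000  # "Each run lasts for 5 × 10^6 environmental steps"
N_PARALLEL_ENVS = 4          # "four parallel environment[s] to collect training data"


def rollout(env: StarCraft2Env) -> int:
    """Run one episode with placeholder actions; returns steps taken."""
    env.reset()
    terminated, steps = False, 0
    while not terminated:
        # First available action per agent stands in for the (unspecified) policy.
        actions = [env.get_avail_agent_actions(a).index(1)
                   for a in range(env.n_agents)]
        _, terminated, _ = env.step(actions)
        steps += 1
    return steps


def run_training(seed: int, map_name: str = "3s5z") -> None:
    # Map choice is illustrative; environments are iterated sequentially here,
    # whereas the paper presumably runs them as actual parallel workers.
    envs = [StarCraft2Env(map_name=map_name, seed=seed + i)
            for i in range(N_PARALLEL_ENVS)]
    steps = 0
    while steps < TOTAL_ENV_STEPS:
        for env in envs:
            steps += rollout(env)
            # ... policy-gradient update would happen here ...
    for env in envs:
        env.close()


if __name__ == "__main__":
    for seed in range(N_SEEDS):
        run_training(seed)
```

A faithful reproduction would replace the placeholder action selection with the paper's policy-gradient learner and distribute the four environments across real worker processes; neither detail is specified in the main text.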