Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
CADP: Towards Better Centralized Learning for Decentralized Execution in MARL
Authors: Yihe Zhou, Shunyu Liu, Yunpeng Qing, Tongya Zheng, Kaixuan Chen, Jie Song, Mingli Song
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations on different benchmarks and across various MARL backbones demonstrate that the proposed framework achieves superior performance compared with the state-of-the-art counterparts. Our code is available at https://github.com/zyh1999/CADP. To demonstrate the effectiveness of the proposed CADP framework, we conduct experiments on the StarCraft II micromanagement challenge and Google Research Football benchmark. |
| Researcher Affiliation | Academia | 1. Zhejiang University; 2. Zhejiang Provincial Engineering Research Center for Real-Time Smart Tech in Urban Security Governance, School of Computer and Computing Science, Hangzhou City University; 3. State Key Laboratory of Blockchain and Data Security, Zhejiang University; 4. Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security |
| Pseudocode | Yes | In addition, we provide pseudocode in Appendix D. |
| Open Source Code | Yes | Our code is available at https://github.com/zyh1999/CADP |
| Open Datasets | Yes | To demonstrate the effectiveness of the proposed CADP framework, we conduct experiments on the StarCraft II micromanagement challenge and Google Research Football benchmark. |
| Dataset Splits | No | The paper discusses various scenarios from the StarCraft II micromanagement challenge and Google Research Football benchmark (e.g., "3s5z vs 3s6z", "corridor", "3 vs 1 with keeper scenario"), but it does not provide specific details on how the data was split into training, validation, or test sets in terms of percentages or sample counts. It refers to "learning curves" and evaluation, but the split methodology is not explicitly stated. |
| Hardware Specification | No | The paper mentions "advanced computing resources provided by the Supercomputing Center of Hangzhou City University" in the acknowledgments, but it does not specify any particular hardware components such as GPU models, CPU types, or memory amounts used for the experiments. |
| Software Dependencies | No | The paper mentions the use of various MARL methods and frameworks like QMIX, VDN, QPLEX, and MAPPO, but it does not provide specific version numbers for these or any underlying software libraries (e.g., Python, PyTorch, TensorFlow, CUDA) that would be needed for reproducibility. |
| Experiment Setup | Yes | The detailed hyperparameters are given in Appendix B. We examine the effect of the coefficient α in 3s5z vs 3s6z scenarios in Figure 5. In GRF benchmark, we set T = 3M in 3 vs 1 with keeper scenario and T = 6M in counterattack easy scenario respectively. |