Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Deep Hierarchical Communication Graph in Multi-Agent Reinforcement Learning
Authors: Zeyang Liu, Lipeng Wan, Xue Sui, Zhuoran Chen, Kewu Sun, Xuguang Lan
IJCAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that our method improves performance across various cooperative multi-agent tasks, including Predator Prey, Multi-Agent Coordination Challenge, and StarCraft Multi-Agent Challenge. ... In this section, we conduct empirical experiments to answer the following questions: (1) Is Deep Hierarchical Communication Graph (DHCG) better than the existing MARL methods... (2) Can DHCG outperforms the pre-defined topologies or existing graph-based methods? (3) How does DHCG differ from communication-enabled algorithms? (4) Can DHCG generate different graphs to adapt to different situations? |
| Researcher Affiliation | Academia | Zeyang Liu¹, Lipeng Wan¹, Xue Sui¹, Zhuoran Chen¹, Kewu Sun² and Xuguang Lan¹. ¹National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University. ²Intelligent Science & Technology Academy. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | In this section, we compare the performance of MAPPO [Yu et al., 2022], HAPPO [Kuba et al., 2022], QMIX [Rashid et al., 2018], DCG [Böhmer et al., 2020], CASEC, SOP-CG [Yang et al., 2022], and DHCG on Predator Prey [Son et al., 2019], Multi-Agent Coordination Challenge (MACO) [Wang et al., 2022], and StarCraft Multi-Agent Challenge (SMAC) [Samvelyan et al., 2019]. |
| Dataset Splits | No | The paper uses standard benchmark datasets but does not explicitly provide specific train/validation/test dataset splits (e.g., percentages or sample counts) needed for reproduction. It mentions running 'five independent runs with different random seeds'. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments (e.g., specific GPU/CPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We use an ϵ-greedy exploration scheme, where ϵ decreases from 1 to 0.05 over 50 thousand timesteps in 10m_vs_11m and MMM2, and over 1 million timesteps in corridor and 3s5z_vs_3s6z. |
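
The exploration schedule quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: the quoted text only says ϵ "decreases" from 1 to 0.05, so the linear shape of the decay is an assumption, and the names `epsilon_at` and `ANNEAL_STEPS` are hypothetical.

```python
# Hypothetical sketch of the epsilon schedule described in the table.
# ASSUMPTION: the decay is linear; the paper excerpt only states that
# epsilon decreases from 1 to 0.05 over the given number of timesteps.

# Annealing horizons per SMAC map, taken from the quoted setup.
ANNEAL_STEPS = {
    "10m_vs_11m": 50_000,
    "MMM2": 50_000,
    "corridor": 1_000_000,
    "3s5z_vs_3s6z": 1_000_000,
}


def epsilon_at(t, anneal_steps, eps_start=1.0, eps_finish=0.05):
    """Return epsilon at timestep t, annealed linearly and then held constant."""
    if anneal_steps <= 0:
        raise ValueError("anneal_steps must be positive")
    # Fraction of the annealing period completed, clipped to [0, 1].
    frac = min(max(t, 0) / anneal_steps, 1.0)
    return eps_start + frac * (eps_finish - eps_start)


if __name__ == "__main__":
    for map_name, steps in ANNEAL_STEPS.items():
        halfway = epsilon_at(steps // 2, steps)
        print(f"{map_name}: epsilon at the halfway point is {halfway:.3f}")
```

Under the linear assumption, ϵ stays at 0.05 once the annealing horizon has passed, so the longer horizon on corridor and 3s5z_vs_3s6z keeps exploration high for much more of training.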