Deep Hierarchical Communication Graph in Multi-Agent Reinforcement Learning
Authors: Zeyang Liu, Lipeng Wan, Xue Sui, Zhuoran Chen, Kewu Sun, Xuguang Lan
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that our method improves performance across various cooperative multi-agent tasks, including Predator Prey, Multi-Agent Coordination Challenge, and StarCraft Multi-Agent Challenge. ... In this section, we conduct empirical experiments to answer the following questions: (1) Is Deep Hierarchical Communication Graph (DHCG) better than the existing MARL methods? (2) Can DHCG outperform the pre-defined topologies or existing graph-based methods? (3) How does DHCG differ from communication-enabled algorithms? (4) Can DHCG generate different graphs to adapt to different situations? |
| Researcher Affiliation | Academia | Zeyang Liu1, Lipeng Wan1, Xue Sui1, Zhuoran Chen1, Kewu Sun2 and Xuguang Lan1. 1National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University. 2Intelligent Science & Technology Academy. {zeyang.liu, wanlipeng, suixue98, zhuoran.chen}@stu.xjtu.edu.cn, sun kewu@126.com, xglan@mail.xjtu.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | In this section, we compare the performance of MAPPO [Yu et al., 2022], HAPPO [Kuba et al., 2022], QMIX [Rashid et al., 2018], DCG [Böhmer et al., 2020], CASEC, SOP-CG [Yang et al., 2022], and DHCG on Predator Prey [Son et al., 2019], Multi-Agent Coordination Challenge (MACO) [Wang et al., 2022], and StarCraft Multi-Agent Challenge (SMAC) [Samvelyan et al., 2019]. |
| Dataset Splits | No | The paper uses standard benchmark datasets but does not explicitly provide specific train/validation/test dataset splits (e.g., percentages or sample counts) needed for reproduction. It mentions running 'five independent runs with different random seeds'. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments (e.g., specific GPU/CPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We use an ϵ-greedy exploration scheme, where ϵ decreases from 1 to 0.05 over 50 thousand timesteps in 10m_vs_11m and MMM2, and over 1 million timesteps in corridor and 3s5z_vs_3s6z. (A minimal sketch of this annealing schedule follows the table.) |
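
The quoted setup describes a linearly annealed ϵ-greedy schedule. The sketch below shows one plausible reading of that schedule; the step counts are taken from the quote, while the function names and the linear-decay assumption are illustrative and not taken from the authors' code.

```python
# Hedged sketch: linearly annealed epsilon-greedy exploration, assuming a
# linear decay from 1.0 to 0.05 over `anneal_steps` environment timesteps
# (50k for 10m_vs_11m / MMM2, 1M for corridor / 3s5z_vs_3s6z per the quote).
import random


def epsilon_at(step, start=1.0, end=0.05, anneal_steps=50_000):
    """Linearly decay epsilon from `start` to `end`, then hold at `end`."""
    fraction = min(step / anneal_steps, 1.0)
    return start + fraction * (end - start)


def select_action(q_values, step, anneal_steps=50_000):
    """Epsilon-greedy selection over a list of per-action Q-values."""
    eps = epsilon_at(step, anneal_steps=anneal_steps)
    if random.random() < eps:
        return random.randrange(len(q_values))                     # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])    # exploit


# Example: epsilon is 1.0 at step 0, 0.525 halfway through, 0.05 afterwards.
print(epsilon_at(0), epsilon_at(25_000), epsilon_at(100_000))
```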