Effective and Stable Role-Based Multi-Agent Collaboration by Structural Information Principles

Authors: Xianghua Zeng, Hao Peng, Angsheng Li

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluations on the StarCraft II micromanagement benchmark demonstrate that, compared with state-of-the-art MARL algorithms, the SR-MARL framework improves the average test win rate by 0.17%, 6.08%, and 3.24%, and reduces the deviation by 16.67%, 30.80%, and 66.30%, under easy, hard, and super hard scenarios.
Researcher Affiliation | Academia | Xianghua Zeng (1), Hao Peng (1), Angsheng Li (1, 2); (1) State Key Laboratory of Software Development Environment, Beihang University, Beijing, China; (2) Zhongguancun Laboratory, Beijing, China. {zengxianghua, penghao, angsheng}@buaa.edu.cn, liangsheng@gmail.zgclab.edu.cn.
Pseudocode | Yes | Algorithm 1: The Sparsification Algorithm
Open Source Code | Yes | All source code and data are available at GitHub: https://github.com/RingBDStack/SR-MARL
Open Datasets | Yes | We evaluate SR-MARL on the StarCraft II micromanagement (SMAC) benchmark (Samvelyan et al. 2019), a mainstream benchmark for CTDE algorithms, because of its rich environment and high control complexity. (A minimal environment-loading sketch follows the table.)
Dataset Splits | No | The paper reports 'test win rates' and 'average test win rates' but does not specify training/validation/test splits with percentages, sample counts, or explicit cross-validation details for reproducibility.
Hardware Specification | Yes | All experiments adopt the default settings and are conducted on a 3.00GHz Intel Core i9 CPU and an NVIDIA RTX A6000 GPU.
Software Dependencies | No | The implementations of SR-MARL and the baselines are based on PyMARL (Samvelyan et al. 2019), and the hyperparameters of the baselines have been fine-tuned on the SMAC benchmark. The paper names PyMARL but does not provide version numbers for it or for other dependencies such as Python, PyTorch, or TensorFlow. (A version-recording sketch follows the table.)
Experiment Setup | No | The paper states that baseline hyperparameters were 'fine-tuned' and that 'default settings' were adopted, but it does not list concrete hyperparameter values (e.g., learning rate, batch size) or detailed system-level training configurations. (An illustrative configuration sketch follows the table.)
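For context on the Open Datasets row: a minimal sketch, assuming the open-source `smac` Python package that accompanies the SMAC benchmark, of loading one scenario and rolling it out with a random placeholder policy. The map name "3m" and the random policy are illustrative assumptions, not the paper's method.

```python
# Minimal sketch: load one SMAC micromanagement scenario and run a single
# episode with a random (placeholder) policy. Assumes the `smac` package
# from the SMAC benchmark release; "3m" is only an example map.
import numpy as np
from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="3m")
n_agents = env.get_env_info()["n_agents"]

env.reset()
terminated = False
episode_return = 0.0
while not terminated:
    actions = []
    for agent_id in range(n_agents):
        # Sample uniformly among the actions currently available to this agent.
        avail = env.get_avail_agent_actions(agent_id)
        actions.append(np.random.choice(np.nonzero(avail)[0]))
    reward, terminated, info = env.step(actions)
    episode_return += reward

print("episode return:", episode_return, "won:", info.get("battle_won"))
env.close()
```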
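Because no dependency versions are reported, anyone reproducing the results must pin their own. A minimal sketch, assuming a PyMARL-style Python stack, of recording the versions actually installed; the package list is an assumption, not taken from the paper.

```python
# Minimal sketch: record the Python and library versions used for a run,
# since the paper does not pin them. The package list is an assumption
# about what a PyMARL-based setup typically requires.
import importlib.metadata as md
import sys

packages = ["torch", "numpy", "smac", "pyyaml", "sacred"]
print("python", sys.version.split()[0])
for name in packages:
    try:
        print(name, md.version(name))
    except md.PackageNotFoundError:
        print(name, "not installed")
```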
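On the missing experiment setup: an illustrative sketch of the kind of run configuration a PyMARL-based framework consumes. Every value below is a placeholder chosen for illustration only; the paper does not report SR-MARL's actual hyperparameters, and these numbers should not be read as its settings.

```python
# Illustrative only: the shape of a PyMARL-style experiment configuration.
# All values are hypothetical placeholders, not the paper's settings.
illustrative_config = {
    "env": "sc2",
    "env_args": {"map_name": "3m"},   # example easy map, not from the paper
    "lr": 5e-4,                        # placeholder learning rate
    "batch_size": 32,                  # placeholder batch size (episodes)
    "buffer_size": 5000,               # placeholder replay-buffer capacity
    "gamma": 0.99,                     # placeholder discount factor
    "target_update_interval": 200,     # placeholder target-network sync period
    "t_max": 2_000_000,                # placeholder total environment steps
    "seed": 0,                         # reporting seeds would aid reproducibility
}

for key, value in illustrative_config.items():
    print(f"{key}: {value}")
```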