Improving Multi-agent Reinforcement Learning with Stable Prefix Policy

Authors: Yue Deng, Zirui Wang, Yin Zhang

IJCAI 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We scale our approach to various value-based MARL methods and empirically verify our method in a cooperative MARL task, SMAC benchmarks. Experimental results demonstrate that our method achieves not only better performance but also faster convergence speed than baseline algorithms within early time steps.
Researcher Affiliation	Academia	College of Computer Science and Technology, Zhejiang University {devindeng, ziseoiwong, zhangyin98}@zju.edu.cn
Pseudocode	Yes	The pseudo-code is provided in Appendix A.
Open Source Code	No	The paper does not provide a direct link to its source code or explicitly state that its code is open-sourced or available in supplementary materials. It mentions codebases for baselines, but not for its own implementation.
Open Datasets	Yes	We evaluate the performance of our method via the fully cooperative StarCraft II micro-management challenges by the mean winning rate in each scenario... SMAC: We verify our proposed stable prefix policy methods on 6 subtasks of two difficulties... The details of other SMAC tasks are shown in Appendix B.
Dataset Splits	No	The paper does not explicitly describe a validation dataset split (e.g., percentages or specific counts) from its experimental environment (SMAC benchmarks). While it discusses training and testing, a distinct validation set split is not detailed.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions external codebases used for baselines (e.g., "QMIX, QPLEX, and W-QMIX in this paper are from pymarl codebase [Hu et al., 2021]" or "MACPF is from the codebase [Zhang et al., 2021; Wang et al., 2023]"), but it does not specify version numbers for any key software components or libraries (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup	Yes	Other hyper-parameters are in Appendix C.