reproducibilityindex.ai

Sample-Efficient Multiagent Reinforcement Learning with Reset Replay

Authors: Yaodong Yang, Guangyong Chen, Jianye Hao, Pheng-Ann Heng

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments in SMAC and MPE show that MARR significantly improves the performance of various MARL approaches with much fewer environment interactions.
Researcher Affiliation	Collaboration	(1) Department of CSE, CUHK; (2) Zhejiang Lab; (3) Shenzhen Institutes of Advanced Technology, CAS; (4) Tianjin University; (5) Noah's Ark Lab, Huawei; (6) Institute of Medical Intelligence and XR, CUHK.
Pseudocode	Yes	Algorithm 1 Multiagent Reinforcement Learning with Reset Replay (MARR)
Open Source Code	Yes	Code is available at GitHub.
Open Datasets	Yes	In this section, we validate MARR on both the StarCraft Multi-Agent Challenge (SMAC) (Samvelyan et al., 2019) with discrete action space and the Multiagent Particle Environment (MPE) (Lowe et al., 2017) with continuous action space.
Dataset Splits	No	The paper describes using standard benchmark environments (SMAC and MPE) and evaluation metrics (test win rate, episodic return) over multiple independent runs. However, it does not provide explicit numerical details for training/validation/test dataset splits, as the data is generated through environment interaction rather than being from a fixed, pre-split dataset.
Hardware Specification	No	The paper mentions running experiments in parallel environments (e.g., 'number of parallel environments to 8'), but it does not specify any particular hardware details such as GPU models, CPU types, or cloud computing instances used for these experiments.
Software Dependencies	Yes	The SMAC environment is with discrete action space and the used version of StarCraft II is 4.6.2. We implement MARR based on the pymarl framework (Samvelyan et al., 2019).
Experiment Setup	Yes	For all the tasks, we set α at 0.8 and the reset interval T_R at 2000 for Shrink & Perturb, and set a at 0.8 and b at 1.2 for the random amplitude scale.
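
The Experiment Setup row gives only the hyperparameter values, so the following is a minimal sketch of how they might be used, not the authors' implementation. It assumes the commonly cited Shrink & Perturb form (new weights = α × current weights + (1 − α) × freshly initialized weights), assumes the reset interval counts training steps, and assumes random amplitude scaling multiplies an observation by one scalar drawn uniformly from [a, b]. All function names are hypothetical.

```python
import random

# Values taken from the Experiment Setup row.
ALPHA = 0.8                        # Shrink & Perturb coefficient
RESET_INTERVAL = 2000              # reset interval T_R (unit assumed: training steps)
SCALE_LOW, SCALE_HIGH = 0.8, 1.2   # a and b for the random amplitude scale


def shrink_and_perturb(params, fresh_params, alpha=ALPHA):
    """Blend current weights with freshly initialized ones (assumed form)."""
    return [alpha * p + (1.0 - alpha) * f for p, f in zip(params, fresh_params)]


def should_reset(step, interval=RESET_INTERVAL):
    """Return True on every interval-th step, excluding step 0."""
    return step > 0 and step % interval == 0


def random_amplitude_scale(observation, rng, low=SCALE_LOW, high=SCALE_HIGH):
    """Multiply the whole observation by one scalar drawn from [low, high]."""
    scale = rng.uniform(low, high)
    return [scale * x for x in observation]


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [1.0, -2.0, 0.5]
    fresh = [rng.gauss(0.0, 0.1) for _ in weights]  # stand-in for a re-initialization
    for step in (1999, 2000, 4000):
        if should_reset(step):
            weights = shrink_and_perturb(weights, fresh)
    print(weights)
    print(random_amplitude_scale([1.0, 2.0, 3.0], rng))
```

With α = 0.8, each reset keeps 80% of the learned weights and mixes in 20% of a fresh initialization, so repeated resets shrink the weights gradually rather than discarding them.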