Attention-Guided Contrastive Role Representations for Multi-agent Reinforcement Learning
Authors: Zican Hu, Zongzhang Zhang, Huaxiong Li, Chunlin Chen, Hongyu Ding, Zhi Wang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on challenging StarCraft II micromanagement and Google Research Football tasks demonstrate the state-of-the-art performance of our method and its advantages over existing approaches. |
| Researcher Affiliation | Academia | ¹Department of Control Science and Intelligent Engineering, Nanjing University; ²School of Artificial Intelligence, Nanjing University |
| Pseudocode | Yes | Based on the implementations in Section 2, we summarize the brief procedure of ACORM based on QMIX in Algorithm 1. (A hedged sketch of the core contrastive step appears after this table.) |
| Open Source Code | Yes | Our code is available at https://github.com/NJU-RL/ACORM. |
| Open Datasets | Yes | ACORM is implemented on top of two popular MARL algorithms, QMIX (Rashid et al., 2020) and MAPPO (Yu et al., 2022), and benchmarked on the challenging StarCraft Multi-Agent Challenge (SMAC) (Samvelyan et al., 2019) and Google Research Football (GRF) (Kurach et al., 2020) environments. |
| Dataset Splits | No | The paper mentions evaluating 'test win rate' and using a replay buffer, but does not explicitly describe train/validation/test dataset splits (e.g., percentages or specific counts for each split). |
| Hardware Specification | No | The paper does not explicitly mention any specific hardware specifications (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' and 'ReLU as the activation function' and building on 'QMIX' and 'MAPPO', but does not provide specific version numbers for any software components or libraries. |
| Experiment Setup | Yes | Table 2: Hyperparameters used for ACORM based on QMIX. It lists values for 'buffer size', 'batch size', 'learning rate', 'start epsilon', 'epsilon decay steps', 'discount factor', and other training parameters. (An illustrative configuration sketch follows this table.) |
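The Pseudocode row points to Algorithm 1, which wraps QMIX training around a contrastive role-representation update. As a minimal sketch only (not the authors' implementation: `RoleEncoder`, `info_nce_loss`, the dimensions, and the temperature `tau` are all assumed names and values), the snippet below shows an InfoNCE-style contrastive loss of the kind such a procedure relies on; consult the released code at https://github.com/NJU-RL/ACORM for the actual method.

```python
# Illustrative sketch of a contrastive role-representation step.
# All names, shapes, and hyperparameters here are hypothetical placeholders,
# not the authors' implementation (see https://github.com/NJU-RL/ACORM).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RoleEncoder(nn.Module):
    """Maps each agent's trajectory embedding to a role representation."""
    def __init__(self, input_dim: int, embed_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def info_nce_loss(anchors, positives, negatives, tau: float = 0.1):
    """InfoNCE-style loss: pull each anchor toward its positive
    (same role cluster) and away from negatives (other clusters)."""
    anchors = F.normalize(anchors, dim=-1)       # (N, D)
    positives = F.normalize(positives, dim=-1)   # (N, D)
    negatives = F.normalize(negatives, dim=-1)   # (N, K, D)
    pos = (anchors * positives).sum(-1, keepdim=True) / tau     # (N, 1)
    neg = torch.einsum("nd,nkd->nk", anchors, negatives) / tau  # (N, K)
    logits = torch.cat([pos, neg], dim=1)
    labels = torch.zeros(len(anchors), dtype=torch.long)  # positive at index 0
    return F.cross_entropy(logits, labels)

# Toy usage: 8 agents, 32-dim trajectory embeddings, 4 negatives each.
encoder = RoleEncoder(input_dim=32, embed_dim=64)
z = encoder(torch.randn(8, 32))
loss = info_nce_loss(z, z.detach(), torch.randn(8, 4, 64))
loss.backward()
```

In the paper, positives and negatives would come from role clusters formed over the agents' trajectory embeddings; the random tensors above stand in purely to keep the example self-contained and runnable.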
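The Experiment Setup row names the hyperparameters from Table 2 but does not reproduce their values. Purely as an illustration of how such a setup is typically recorded in code, the sketch below uses placeholder values; the paper's Table 2 and the released repository hold the real settings.

```python
# Hypothetical placeholder values only -- the actual numbers are in Table 2
# of the paper and the released code; nothing here is the reported setting.
acorm_qmix_config = {
    "buffer_size": 5000,            # replay buffer capacity (episodes) -- placeholder
    "batch_size": 32,               # episodes sampled per update -- placeholder
    "learning_rate": 5e-4,          # Adam optimizer step size -- placeholder
    "start_epsilon": 1.0,           # initial exploration rate -- placeholder
    "epsilon_decay_steps": 50000,   # linear epsilon decay horizon -- placeholder
    "discount_factor": 0.99,        # gamma -- placeholder
}
```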