Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games
Authors: Dingyang Chen, Yile Li, Qi Zhang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our empirical results demonstrate the effectiveness of these innovations when instantiated with a state-of-the-art CTDE algorithm, achieving competitive policy performance with only a fraction of communication during training." From Section 5 (Experiments): "Our experiments aim to answer the following questions in Sections 5.1-5.3, respectively: 1) How communication-efficient is our algorithm proposed in Section 4 against baselines and ablations? 2) How empirically effective is policy consensus? 3) What are the qualitative properties of the learned communication rules?" |
| Researcher Affiliation | Academia | Dingyang Chen, Yile Li, Qi Zhang; Artificial Intelligence Institute, University of South Carolina; dingyang@email.sc.edu, qz5@cse.sc.edu |
| Pseudocode | Yes | E.2 PSEUDOCODE: Algorithm 1, "Pseudocode of our communication-efficient actor-critic algorithm" (a hedged sketch of such a training loop follows this table). |
| Open Source Code | No | No explicit statement or link indicating the release of open-source code for the methodology described in the paper. |
| Open Datasets | Yes | Environments. We evaluate our algorithm on three tasks in Multi-Agent Particle Environment (MPE) with the efficient implementation by Liu et al. (2020), each of which has a version with N = 15 agents and another with N = 30 agents. As described in Section 3.1, these MPE environments can be cast as homogeneous MGs provided full observability and the permutation preserving observation functions. |
| Dataset Splits | No | No explicit train/validation/test dataset splits are provided, as the experiments are conducted in a simulation environment (MPE) where data is generated during training rather than from a static dataset. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) are provided for the experimental setup. |
| Software Dependencies | No | Table 2 mentions the Adam optimizer and components such as GCN and Gumbel-Softmax, but it does not specify version numbers for any software dependencies (e.g., Python, PyTorch/TensorFlow, or specific library versions). A hedged illustration of a Gumbel-Softmax communication gate follows this table. |
| Experiment Setup | Yes | E.3 HYPERPARAMETERS, Table 2 lists detailed settings such as "Episode length 25", "Number of training episodes 40000", "Discount factor 0.95", "Batch size from replay buffer 256", and optimizer learning rates, among others (collected into a config sketch below). |
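The Pseudocode row above refers to the paper's Algorithm 1 (Appendix E.2), which is not reproduced in this report. Purely to illustrate the kind of loop such an algorithm describes, here is a minimal Python sketch of a communication-efficient actor-critic training loop with per-step communication gating and policy consensus among homogeneous agents. The objects `agents`, `env`, and `gate` and all of their methods are hypothetical stand-ins, not the paper's actual interfaces; only the episode length, episode count, and the consensus/gating ideas come from the quoted excerpts.

```python
import torch

def consensus(agents):
    # Policy consensus for homogeneous agents: average the policy
    # parameters so every agent ends up with the same shared policy.
    with torch.no_grad():
        for group in zip(*(a.policy.parameters() for a in agents)):
            mean = torch.stack([p.data for p in group]).mean(dim=0)
            for p in group:
                p.data.copy_(mean)

def train(agents, env, gate, num_episodes=40000, episode_length=25):
    # Hypothetical training loop: at each step, a learned gate decides
    # whether an agent broadcasts a message, so many steps require no
    # communication at all during training.
    for _ in range(num_episodes):
        obs = env.reset()
        for _ in range(episode_length):
            send = [gate.decide(o) for o in obs]           # learned communication rule
            msgs = [a.message(o) if s else None
                    for a, o, s in zip(agents, obs, send)]
            acts = [a.act(o, msgs) for a, o in zip(agents, obs)]
            obs, rewards, done = env.step(acts)
            # (critic/actor updates from a replay buffer would go here)
            if done:
                break
        consensus(agents)  # periodic parameter averaging across agents
```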
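Table 2's mention of Gumbel-Softmax suggests that the discrete communicate/skip decision is sampled with the Gumbel-Softmax trick so it remains differentiable. Below is a minimal sketch using PyTorch's built-in `torch.nn.functional.gumbel_softmax`; the tensor shapes and the two-way send/skip framing are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

# A binary "communicate or not" gate via Gumbel-Softmax. The two logits
# per agent score the send/skip options; hard=True returns a discrete
# one-hot sample in the forward pass while gradients flow through the
# soft relaxation in the backward pass.
logits = torch.randn(8, 2, requires_grad=True)  # 8 agents, 2 options (hypothetical shapes)
gate = F.gumbel_softmax(logits, tau=1.0, hard=True, dim=-1)
send_mask = gate[:, 0]  # 1.0 where the agent chooses to communicate
```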
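The Table 2 values quoted in the Experiment Setup row can be collected into a small configuration; the sketch below does only that, plus an Adam optimizer as mentioned in the Software Dependencies row. The learning-rate value and the `torch.nn.Linear` stand-in network are placeholders, since neither is quoted in this report.

```python
import torch

# Hyperparameters quoted from Table 2 (Appendix E.3) in the row above.
config = {
    "episode_length": 25,             # steps per episode
    "num_training_episodes": 40000,
    "discount_factor": 0.95,          # gamma
    "batch_size": 256,                # samples drawn from the replay buffer
}

# Table 2 also lists Adam learning rates, but the exact values are not
# quoted here; lr=1e-3 below is a placeholder, not the paper's value.
policy = torch.nn.Linear(16, 4)       # stand-in for the actual actor network
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
```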