$\rm E(3)$-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning

Authors: Dingyang Chen, Qi Zhang

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "As a result, our method achieves superior sample efficiency and generalization performance in a range of benchmark MARL tasks that exhibit continuous E(3)-symmetries but were not accommodated by prior work." and, from Section 6 (Experiments): "Environments. We choose the popular cooperative MARL benchmarks of MPE, MuJoCo continuous control tasks (MuJoCo tasks), including the 2D ones from Tassa et al. (2018) and 3D ones from Chen et al. (2023) with single- and multi-agent variations, and StarCraft Multi-Agent Challenge (SMAC) (Samvelyan et al., 2019) to evaluate the effectiveness of our E(3)-equivariant multi-agent actor-critic methods described in Section 5." |
| Researcher Affiliation | Academia | "Artificial Intelligence Institute, University of South Carolina, Columbia, SC, USA. Correspondence to: Dingyang Chen <dingyang@email.sc.edu>, Qi Zhang <qz5@cse.sc.edu>." |
| Pseudocode | No | The paper describes its methods and architectures using text and diagrams (e.g., Figure 3), but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | "Our code is publicly available at https://github.com/dchen48/E3AC." |
| Open Datasets | Yes | "Environments. We choose the popular cooperative MARL benchmarks of MPE, MuJoCo continuous control tasks (MuJoCo tasks), including the 2D ones from Tassa et al. (2018) and 3D ones from Chen et al. (2023) with single- and multi-agent variations, and StarCraft Multi-Agent Challenge (SMAC) (Samvelyan et al., 2019) to evaluate the effectiveness of our E(3)-equivariant multi-agent actor-critic methods described in Section 5." |
| Dataset Splits | No | The paper reports training and evaluation settings typical for reinforcement learning (e.g., 'Number of training episodes' and '#episodes per evaluation' in the hyperparameter tables), but it does not specify explicit train/validation/test dataset splits with percentages or sample counts, as would be used in a supervised learning context. |
| Hardware Specification | Yes | "The code is implemented by PyTorch, and runs on NVIDIA Tesla V100 GPUs with 32 CPU cores." |
| Software Dependencies | No | The paper mentions "The code is implemented by PyTorch" but does not provide specific version numbers for PyTorch or any other software libraries or dependencies. |
| Experiment Setup | Yes | "Table 1: Hyperparameters for MPE" excerpt: Batch size from replay buffer for [MLP, MLP] and [GCN, MLP]: 1024; Actor's learning rate for [MLP, MLP] and [GCN, MLP]: 1e-4; Critic's learning rate for [MLP, MLP] and [GCN, MLP]: 1e-3. |
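
To make the quoted experiment setup concrete, the sketch below wires the reported MPE hyperparameters (batch size 1024, actor learning rate 1e-4, critic learning rate 1e-3) into a PyTorch actor-critic training skeleton. The placeholder MLP networks, their layer sizes, and the use of Adam are illustrative assumptions only; the paper's actual actor and critic are the E(3)-equivariant architectures described in its Section 5, which are not reproduced here.

```python
import torch
import torch.nn as nn

# Hyperparameters reported for MPE in Table 1 of the paper.
BATCH_SIZE = 1024   # batch size sampled from the replay buffer
ACTOR_LR = 1e-4     # actor learning rate for [MLP, MLP] and [GCN, MLP]
CRITIC_LR = 1e-3    # critic learning rate for [MLP, MLP] and [GCN, MLP]

# Placeholder networks (assumed shapes for illustration only); the paper's
# actual actor/critic are E(3)-equivariant and structured per its Section 5.
actor = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 4))
critic = nn.Sequential(nn.Linear(36, 64), nn.ReLU(), nn.Linear(64, 1))

# Optimizer choice (Adam) is an assumption; the paper reports only the rates.
actor_optimizer = torch.optim.Adam(actor.parameters(), lr=ACTOR_LR)
critic_optimizer = torch.optim.Adam(critic.parameters(), lr=CRITIC_LR)

# A training step would then draw BATCH_SIZE transitions from the replay
# buffer before computing the critic loss and the actor's policy gradient.
```

This mirrors the standard off-policy actor-critic setup implied by the "batch size from replay buffer" entry: two separate optimizers, with the critic trained at a higher learning rate than the actor.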