An Adaptive Entropy-Regularization Framework for Multi-Agent Reinforcement Learning

Authors: Woojun Kim, Youngchul Sung

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results show that the proposed scheme significantly outperforms current state-of-the-art multi-agent RL algorithms." and "4. Experiments: Here, we provide numerical results and ablation studies."
Researcher Affiliation | Academia | "School of Electrical Engineering, KAIST, Daejeon 34141, Republic of Korea."
Pseudocode | Yes | "Algorithm 1: ADaptive Entropy-Regularization for multi-agent reinforcement learning (ADER)"
Open Source Code | Yes | "The source code is available at https://github.com/wjkim1202/ader."
Open Datasets | Yes | Multi-agent HalfCheetah (Peng et al., 2021), Heterogeneous Predator-Prey (HPP), and the StarCraft II micromanagement benchmark (SMAC) environment (Samvelyan et al., 2019).
Dataset Splits | No | The paper uses standard benchmark environments (HalfCheetah, HPP, SMAC) but does not provide train/validation/test splits, such as specific percentages, sample counts, or a clear partitioning methodology for reproducibility.
Hardware Specification | Yes | "We conducted the experiments on a server with Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz and 8 NVIDIA Titan Xp GPUs."
Software Dependencies | No | The paper mentions software such as PyTorch and specific environments (SMAC, GRF) but does not provide version numbers for any of the libraries or dependencies used in the implementation.
Experiment Setup | Yes | "We use an MLP with 2 hidden layers which have 400 and 300 hidden units and ReLU activation functions. The replay buffer stores up to 10^6 transitions and 100 transitions are uniformly sampled for training. We set the hyperparameter for EMA filter as ξ = 0.9 and initialize the temperature parameters as α_i^init = e^{-2} for all i ∈ N." Table 1 gives the specific hyperparameters for SMAC. A minimal code sketch of this configuration follows the table.
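
To make the reported configuration concrete, below is a minimal sketch of the network architecture and hyperparameters quoted in the Experiment Setup row, assuming PyTorch (which the paper mentions). The class, variable, and key names are illustrative assumptions, not the authors' code; the actual implementation is in the linked repository.

```python
# Minimal sketch of the reported training setup, assuming PyTorch.
# Names here are illustrative, not the authors' code; see
# https://github.com/wjkim1202/ader for the real implementation.
import math

import torch
import torch.nn as nn


class MLP(nn.Module):
    """2-hidden-layer MLP with 400 and 300 units and ReLU activations, as reported."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 400), nn.ReLU(),
            nn.Linear(400, 300), nn.ReLU(),
            nn.Linear(300, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


# Hyperparameters quoted in the Experiment Setup row (key names are assumptions).
config = {
    "replay_buffer_size": 10**6,   # replay buffer stores up to 10^6 transitions
    "batch_size": 100,             # transitions uniformly sampled per update
    "ema_xi": 0.9,                 # ξ for the EMA filter
    "alpha_init": math.exp(-2.0),  # per-agent temperature initialization
}

if __name__ == "__main__":
    # Dimensions below are placeholders; the real ones depend on the environment.
    critic = MLP(in_dim=32, out_dim=1)
    dummy_obs = torch.randn(config["batch_size"], 32)
    print(critic(dummy_obs).shape)  # -> torch.Size([100, 1])
```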