On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning
Authors: Weichao Mao, Lin Yang, Kaiqing Zhang, Tamer Başar
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we provide numerical simulations to corroborate our theoretical findings. |
| Researcher Affiliation | Collaboration | ¹Department of Electrical and Computer Engineering & Coordinated Science Laboratory, University of Illinois Urbana-Champaign. ²Department of Electrical and Computer Engineering, University of California, Los Angeles. Part of this work done while the author was visiting DeepMind. ³Laboratory for Information & Decision Systems, Massachusetts Institute of Technology. Part of this work done while the author was visiting Simons Institute for the Theory of Computing. |
| Pseudocode | Yes | Algorithm 1: Stage-Based V-Learning for CCE (agent i) |
| Open Source Code | No | The paper does not include any explicit statements about making the source code open, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We use a classic matrix team example from the literature (Claus & Boutilier, 1998; Lauer & Riedmiller, 2000)... |
| Dataset Splits | No | The paper describes episodic reinforcement learning settings with episodes and steps (e.g., 'K = 50000 episodes, each episode containing H = 10 steps') but does not specify fixed train/validation/test data splits as would be typical for a supervised learning setup. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory used for running its experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks like Python, PyTorch, or TensorFlow versions) that were used for the experiments. |
| Experiment Setup | Yes | We run Algorithm 3 on this task for $T = 5000$ rounds, and we set the step size $\eta_t = 10^{-4}$ and the momentum parameter $a_t = 0.5$. |