Mean Field Multi-Agent Reinforcement Learning
Authors: Yaodong Yang, Rui Luo, Minne Li, Ming Zhou, Weinan Zhang, Jun Wang
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on Gaussian squeeze, Ising model, and battle games justify the learning effectiveness of our mean field approaches. |
| Researcher Affiliation | Academia | 1University College London, London, United Kingdom. 2Shanghai Jiao Tong University, Shanghai, China. Correspondence to: Jun Wang <j.wang@cs.ucl.ac.uk>, Yaodong Yang <yaodong.yang@cs.ucl.ac.uk>. |
| Pseudocode | Yes | We illustrate the MF-Q iterations in Fig. 2, and present the pseudocode for both MF-Q and MF-AC in Appendix A. |
| Open Source Code | No | The paper does not provide explicit statements or links for its own open-source code. |
| Open Datasets | Yes | In the Gaussian Squeeze (GS) task (Holmes Parker et al., 2014)... The Battle game in the Open-source MAgent system (Zheng et al., 2018). |
| Dataset Splits | No | The paper runs experiments in simulated environments (Gaussian Squeeze, Ising Model, Battle Game) rather than on pre-defined datasets, so no training/validation/test split information is provided. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency versions (e.g., library names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | In the Gaussian Squeeze (GS) task (Holmes Parker et al., 2014)... with µ = 400 and σ = 200. We train all four models with 2,000 rounds of self-play. Critically, MF-Q finds a Curie temperature (the phase-change point), τ = 1.2, similar to that found by MCMC. |
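
For context on the MF-Q iterations cited in the Pseudocode row, the update each agent performs is a Q-learning backup conditioned on the mean action of its neighbors. Below is a minimal tabular sketch of that update, assuming a discretized mean-action index; the paper itself uses neural function approximation, and all names here (`mean_action`, `boltzmann_policy`, `mfq_update`) are illustrative rather than taken from any released code.

```python
import numpy as np

def mean_action(neighbor_actions, n_actions):
    """One-hot encode each neighbor's discrete action and average them."""
    onehot = np.eye(n_actions)[np.asarray(neighbor_actions)]  # (num_neighbors, n_actions)
    return onehot.mean(axis=0)                                # mean action a_bar

def boltzmann_policy(q_row, beta=1.0):
    """pi(a | s, a_bar) proportional to exp(beta * Q(s, a, a_bar))."""
    z = np.exp(beta * (q_row - q_row.max()))  # subtract max for numerical stability
    return z / z.sum()

def mfq_update(Q, s, a, a_bar_idx, r, s_next, a_bar_next_idx,
               alpha=0.1, gamma=0.95, beta=1.0):
    """One MF-Q backup: Q(s, a, a_bar) <- (1 - alpha) * Q + alpha * (r + gamma * v(s')).

    Q is a table of shape (n_states, n_actions, n_mean_action_bins);
    a_bar_idx is the (assumed) discretized bin of the neighbors' mean action.
    """
    pi = boltzmann_policy(Q[s_next, :, a_bar_next_idx], beta)
    v_next = pi @ Q[s_next, :, a_bar_next_idx]  # mean-field value of the next state
    Q[s, a, a_bar_idx] += alpha * (r + gamma * v_next - Q[s, a, a_bar_idx])
    return Q
```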
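Likewise, the Gaussian Squeeze task quoted in the Experiment Setup row has a simple closed-form joint reward. Here is a sketch under the standard formulation of the task, using the µ = 400 and σ = 200 values quoted above; `gaussian_squeeze_reward` is a hypothetical name, not the authors' API.

```python
import numpy as np

def gaussian_squeeze_reward(actions, mu=400.0, sigma=200.0):
    """Joint reward G(x) = x * exp(-((x - mu) / sigma)^2), where x sums all agents' actions."""
    x = float(np.sum(actions))
    return x * np.exp(-((x - mu) / sigma) ** 2)

# Example: 1000 agents each contribute an action in [0, 1];
# the reward peaks when the summed contribution is near mu.
rng = np.random.default_rng(0)
print(gaussian_squeeze_reward(rng.uniform(0, 1, size=1000)))
```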