Mean Field Multi-Agent Reinforcement Learning

Authors: Yaodong Yang, Rui Luo, Minne Li, Ming Zhou, Weinan Zhang, Jun Wang

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on Gaussian squeeze, Ising model, and battle games justify the learning effectiveness of our mean field approaches.
Researcher Affiliation | Academia | University College London, London, United Kingdom; Shanghai Jiao Tong University, Shanghai, China. Correspondence to: Jun Wang <j.wang@cs.ucl.ac.uk>, Yaodong Yang <yaodong.yang@cs.ucl.ac.uk>.
Pseudocode | Yes | We illustrate the MF-Q iterations in Fig. 2, and present the pseudocode for both MF-Q and MF-AC in Appendix A. (An illustrative sketch of the MF-Q backup is given after this table.)
Open Source Code | No | The paper does not provide explicit statements or links for its own open-source code.
Open Datasets | Yes | In the Gaussian Squeeze (GS) task (Holmes Parker et al., 2014)... The Battle game in the open-source MAgent system (Zheng et al., 2018).
Dataset Splits | No | The experiments run in simulated environments (Gaussian Squeeze, Ising Model, Battle Game) rather than on pre-defined datasets, so no training/validation/test splits are reported.
Hardware Specification | No | The paper does not report the hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers needed to replicate the experiments.
Experiment Setup | Yes | In the Gaussian Squeeze (GS) task (Holmes Parker et al., 2014)... with µ = 400 and σ = 200. We train all four models by 2000 rounds of self-play. Critically, MF-Q finds a similar Curie temperature (the phase change point) as MCMC, that is τ = 1.2. (An illustrative sketch of the GS reward is given after this table.)
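
Since the MF-Q and MF-AC pseudocode is only available in Appendix A of the paper, the snippet below is a minimal tabular sketch of the MF-Q backup under stated assumptions. The function names (boltzmann_policy, mf_q_update), the discretisation of the neighbours' mean action into a table index, and all hyper-parameter values are illustrative assumptions rather than the authors' implementation; only the general shape of the update follows the paper: a temporal-difference backup on Q(s, a_j, ā_j), with the next-state value taken in expectation under a Boltzmann policy conditioned on the neighbours' mean action.

    import numpy as np

    def boltzmann_policy(q_row, beta=1.0):
        # Softmax over the agent's own actions for a fixed (state, mean-action) pair.
        logits = beta * (q_row - np.max(q_row))
        probs = np.exp(logits)
        return probs / np.sum(probs)

    def mf_q_update(Q, s, a_j, a_bar, r, s_next, a_bar_next,
                    alpha=0.1, gamma=0.95, beta=1.0):
        # Q is a table indexed as Q[state, own_action, mean_action_bin];
        # a_bar and a_bar_next are (discretised) mean actions of agent j's neighbours.
        pi_next = boltzmann_policy(Q[s_next, :, a_bar_next], beta)
        v_next = float(np.dot(pi_next, Q[s_next, :, a_bar_next]))
        td_target = r + gamma * v_next
        Q[s, a_j, a_bar] += alpha * (td_target - Q[s, a_j, a_bar])
        return Q

    # Toy usage (assumed sizes): 4 states, 3 own actions, 5 mean-action bins.
    Q = np.zeros((4, 3, 5))
    Q = mf_q_update(Q, s=0, a_j=1, a_bar=2, r=1.0, s_next=3, a_bar_next=2)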
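
For the Gaussian Squeeze setup quoted in the Experiment Setup row, the reward is commonly written as G(x) = x * exp(-((x - µ)/σ)²), where x is the sum of all agents' allocations. The sketch below assumes that form with the paper's µ = 400 and σ = 200; the agent count, action range, and the name gaussian_squeeze_reward are illustrative assumptions, not the paper's exact configuration.

    import numpy as np

    def gaussian_squeeze_reward(actions, mu=400.0, sigma=200.0):
        # Joint reward for one round of Gaussian Squeeze: all agents share a
        # single scalar reward that peaks when the summed allocation x is close
        # to mu and is "squeezed" towards zero as x drifts away from it.
        x = float(np.sum(actions))
        return x * np.exp(-((x - mu) / sigma) ** 2)

    # Toy usage (assumed setting): 100 agents, each picking an allocation in [0, 9].
    rng = np.random.default_rng(0)
    actions = rng.integers(0, 10, size=100)
    print(gaussian_squeeze_reward(actions))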