Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Mean Field Multi-Agent Reinforcement Learning
Authors: Yaodong Yang, Rui Luo, Minne Li, Ming Zhou, Weinan Zhang, Jun Wang
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on Gaussian squeeze, Ising model, and battle games justify the learning effectiveness of our mean field approaches. |
| Researcher Affiliation | Academia | ¹University College London, London, United Kingdom. ²Shanghai Jiao Tong University, Shanghai, China. Correspondence to: Jun Wang <EMAIL>, Yaodong Yang <EMAIL>. |
| Pseudocode | Yes | We illustrate the MF-Q iterations in Fig. 2, and present the pseudocode for both MF-Q and MF-AC in Appendix A. |
| Open Source Code | No | The paper does not provide explicit statements or links for its own open-source code. |
| Open Datasets | Yes | In the Gaussian Squeeze (GS) task (Holmes Parker et al., 2014)... The Battle game in the Open-source MAgent system (Zheng et al., 2018). |
| Dataset Splits | No | The paper describes experiments in simulated environments/tasks (Gaussian Squeeze, Ising Model, Battle Game) rather than using pre-defined datasets with explicit training/validation/test splits. Therefore, no specific dataset split information for validation is provided. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency versions (e.g., library names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | In the Gaussian Squeeze (GS) task (Holmes Parker et al., 2014)... with µ = 400 and σ = 200. We train all four models by 2000 rounds self-plays. Critically, MF-Q finds a similar Curie temperature (the phase change point) as MCMC that is τ = 1.2. |