Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications

Authors: Sarah Perrin, Julien Perolat, Mathieu Laurière, Matthieu Geist, Romuald Elie, Olivier Pietquin

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | These theoretical contributions are supported by numerical experiments provided in either model-based or model-free settings. We provide hereby for the first time converging learning dynamics for Mean Field Games in the presence of common noise.
Researcher Affiliation | Collaboration | Sarah Perrin¹, Julien Perolat², Mathieu Laurière³, Matthieu Geist⁴, Romuald Elie², Olivier Pietquin⁴ (¹Univ. Lille, CNRS, Inria, UMR 9189 CRIStAL; ²DeepMind Paris; ³Princeton University, ORFE; ⁴Google Research, Brain Team)
Pseudocode | Yes | Algorithm 1 (Fictitious Play in Mean Field Games): input: an initial policy π_0 and an initial distribution µ_0; define π̄_0 = π_0. For j = 1, . . . , J: find π_j, a best response against µ̄_{j−1} (either with Q-learning or with backward induction); compute π̄_j, the average of (π_0, . . . , π_j); compute µ_{π̄_j} (either with a model-free or model-based method); compute µ̄_j, the average of (µ_0, . . . , µ_{π̄_j}). Return π̄_J, µ̄_J. (A Python sketch of this loop appears after the table.)
Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper.
Open Datasets | No | The paper describes custom environments for its experiments (the Linear Quadratic Mean Field Game and the Beach Bar Process) but does not provide concrete access information (specific link, DOI, repository name, formal citation with authors/year, or reference to established benchmark datasets) for a publicly available or open dataset.
Dataset Splits | No | The paper describes simulation-based experiments in custom environments and does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning.
Hardware Specification | No | The paper states 'Experiments presented in this paper were carried out using the Grid'5000 testbed,' but does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper mentions algorithms like Q-learning and Backward Induction, and parameters like the learning rate and exploration parameter, but does not specify any software dependencies with version numbers (e.g., library names with specific versions like PyTorch 1.9 or Python 3.8).
Experiment Setup | Yes | Experimental setup: We consider a Linear Quadratic MFG with 100 states and a horizon N = 30... we set σ = 3, ∆n = 0.1, K = 1, q = 0.01, κ = 0.5 and c_term = 1. In all the experiments, we set the learning rate of Q-learning to 0.1 and the ε-greedy exploration parameter to 0.2. (A sketch of this Q-learning configuration appears after the table.)
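
For concreteness, the following is a minimal Python sketch of the fictitious-play loop in Algorithm 1 for a finite-state, finite-action MFG. Since the paper releases no code, this is an assumption-laden illustration rather than the authors' implementation: the transition tensor P[s, a, s'], the reward callback reward(n, µ_n), the zero terminal value, and the uniform averaging of policy tables are all simplifying choices made for this sketch.

    import numpy as np

    def best_response(mu_bar, P, reward, N, S, A):
        # Model-based best response via backward induction against a frozen
        # distribution flow mu_bar of shape (N+1, S).
        # P: transition tensor P[s, a, s']; reward(n, mu_n) -> (S, A) array.
        V = np.zeros(S)                          # terminal value, assumed zero
        pi = np.zeros((N, S, A))                 # one-hot (greedy) policy
        for n in reversed(range(N)):
            Q = reward(n, mu_bar[n]) + P @ V     # Q[s, a] = r + E[V(s')]
            pi[n, np.arange(S), Q.argmax(axis=1)] = 1.0
            V = Q.max(axis=1)
        return pi

    def distribution_flow(pi, P, mu0, N):
        # Forward propagation of the population distribution under policy pi.
        mu = np.zeros((N + 1, len(mu0)))
        mu[0] = mu0
        for n in range(N):
            P_pi = np.einsum("sa,sax->sx", pi[n], P)   # policy-induced kernel
            mu[n + 1] = mu[n] @ P_pi
        return mu

    def fictitious_play(P, reward, mu0, N, S, A, J=100):
        # Steps mirror the reconstructed Algorithm 1; averages are uniform.
        pi_bar = np.full((N, S, A), 1.0 / A)     # pi_bar_0: uniform policy
        mu_bar = distribution_flow(pi_bar, P, mu0, N)
        pi_sum, mu_sum = pi_bar.copy(), mu_bar.copy()
        for j in range(1, J + 1):
            pi_j = best_response(mu_bar, P, reward, N, S, A)  # best response
            pi_sum += pi_j
            pi_bar = pi_sum / (j + 1)            # average of (pi_0, ..., pi_j)
            mu_sum += distribution_flow(pi_bar, P, mu0, N)    # mu under pi_bar
            mu_bar = mu_sum / (j + 1)            # average of the induced flows
        return pi_bar, mu_bar

The model-free variant of the algorithm would replace best_response with a Q-learning routine such as the one sketched next.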
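
Below is a similarly hedged sketch of the model-free best-response step with the hyperparameters quoted in the experiment setup (learning rate 0.1, ε-greedy exploration 0.2). The env object and its reset/step interface, including a step method that takes the frozen distribution µ̄_n, are hypothetical stand-ins; the paper does not specify its implementation.

    import numpy as np

    def q_learning_best_response(env, mu_bar, N, S, A,
                                 episodes=10_000, lr=0.1, eps=0.2):
        # Tabular, finite-horizon Q-learning against a frozen flow mu_bar.
        # lr=0.1 and eps=0.2 are the values quoted in the setup above; the
        # env.reset()/env.step() interface is a hypothetical stand-in.
        rng = np.random.default_rng(0)
        Q = np.zeros((N + 1, S, A))              # time-indexed table; Q[N] = 0
        for _ in range(episodes):
            s = env.reset()
            for n in range(N):
                if rng.random() < eps:           # epsilon-greedy exploration
                    a = int(rng.integers(A))
                else:
                    a = int(Q[n, s].argmax())
                s_next, r = env.step(a, mu_bar[n])   # reward depends on mu_bar
                target = r + Q[n + 1, s_next].max()  # undiscounted, horizon N
                Q[n, s, a] += lr * (target - Q[n, s, a])
                s = s_next
        return Q[:N].argmax(axis=2)              # greedy policy, shape (N, S)

A time-indexed Q table is used here because the problem has a finite horizon N, matching the backward-induction alternative named in Algorithm 1.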