Scalable Deep Reinforcement Learning Algorithms for Mean Field Games

Authors: Mathieu Lauriere, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush Jain, Theophile Cabannes, Georgios Piliouras, Julien Perolat, Romuald Elie, Olivier Pietquin, Matthieu Geist

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate numerically that these methods efficiently enable the use of Deep RL algorithms to solve various MFGs. In addition, we show that these methods outperform SOTA baselines from the literature."
Researcher Affiliation | Collaboration | 1 NYU Shanghai, China; 2 Univ. Lille, CNRS, Inria, Centrale Lille, UMR 9189 CRIStAL, France; 3 Google Research; 4 DeepMind; 5 UC Berkeley, California, USA; 6 Singapore University of Technology and Design, Singapore.
Pseudocode | Yes | Algorithm 1 D-AFP; Algorithm 2 D-MOMD; Algorithm 3 Forward update for the distribution; Algorithm 4 Backward induction for the value function evaluation; Algorithm 5 Backward induction for the optimal value function; Algorithm 6 Banach-Picard (BP) fixed point; Algorithm 7 Fictitious Play (FP); Algorithm 8 Policy Iteration (PI); Algorithm 9 Online Mirror Descent (OMD).
Open Source Code | Yes | The code for Deep Munchausen OMD is available in OpenSpiel (Lanctot et al., 2019). See https://github.com/deepmind/open_spiel/blob/master/open_spiel/python/mfg/algorithms/munchausen_deep_mirror_descent.py.
Open Datasets | No | The paper describes several models/environments (Epidemics, Linear-Quadratic MFG, Exploration, Crowd modeling with congestion, Multi-population chasing) but does not provide concrete access information (link, DOI, repository, or formal citation with authors/year) for publicly available datasets used for training.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology).
Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running the experiments are provided.
Software Dependencies | No | The paper mentions OpenSpiel and uses deep RL methods such as DQN, but it does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | Yes | "For this example and the following ones, we display the best exploitability curves obtained for each method after running sweeps over hyperparameters. See Appx. D for some instances of sweeps for D-MOMD." (Figure 7 shows the specific hyperparameters tested: learning_rate ∈ {0.001, 0.01, 0.05, 0.1} and tau ∈ {1, 5, 10, 50}.)
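Of the pseudocode listed in the table, the Banach-Picard (BP) fixed point (Algorithm 6) is the simplest to sketch. The following is a minimal illustrative sketch, not the paper's implementation: the operator `T` here is a toy mean-field update (pushing a distribution through a fixed transition kernel), standing in for the paper's map from a mean field to the distribution induced by the best response against it. The kernel `A` and the helper names are assumptions for the example.

```python
def banach_picard(T, x0, tol=1e-8, max_iter=1000):
    """Iterate x_{k+1} = T(x_k) until successive iterates differ by < tol.

    Convergence is guaranteed when T is a contraction (Banach fixed-point
    theorem); in the MFG setting T would map a mean-field distribution to
    the distribution induced by the best response against it.
    """
    x = x0
    for _ in range(max_iter):
        x_next = T(x)
        if max(abs(a - b) for a, b in zip(x_next, x)) < tol:
            return x_next
        x = x_next
    return x

# Toy row-stochastic transition kernel over 3 states (illustrative only).
A = [[0.6, 0.2, 0.2],
     [0.3, 0.5, 0.2],
     [0.1, 0.3, 0.6]]

def T(mu):
    # One mean-field update: push the distribution mu through the kernel A.
    return [sum(mu[i] * A[i][j] for i in range(3)) for j in range(3)]

# Starting from a point mass, the iterates converge to a fixed point of T
# (here, the stationary distribution of the kernel).
mu_star = banach_picard(T, [1.0, 0.0, 0.0])
```

Fictitious Play (Algorithm 7) differs mainly in that it averages the distributions produced across iterations instead of keeping only the latest one, which stabilizes the iteration when the plain fixed-point map fails to contract.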
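The hyperparameter sweep described in the last row can be sketched as a grid search over the reported D-MOMD values, keeping the configuration with the lowest exploitability. This is a hedged sketch: `evaluate` is a hypothetical callable standing in for a full training run, not an API from the paper's code.

```python
import itertools

# Grid reported in the paper's Appendix D sweeps for D-MOMD (Figure 7).
LEARNING_RATES = [0.001, 0.01, 0.05, 0.1]
TAUS = [1, 5, 10, 50]

def run_sweep(evaluate):
    """Return the (learning_rate, tau) pair with the lowest exploitability.

    `evaluate(lr, tau)` is a hypothetical stand-in that trains D-MOMD with
    the given hyperparameters and returns the final exploitability.
    """
    results = {
        (lr, tau): evaluate(lr, tau)
        for lr, tau in itertools.product(LEARNING_RATES, TAUS)
    }
    best = min(results, key=results.get)
    return best, results[best]

# Usage with a dummy evaluator (illustrative only, not a real training run):
best_cfg, best_expl = run_sweep(lambda lr, tau: lr * tau)
```

In the paper the reported curves are the best obtained per method after such sweeps, so the selection criterion here (minimum exploitability over the grid) mirrors that procedure.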