Scalable Deep Reinforcement Learning Algorithms for Mean Field Games
Authors: Mathieu Laurière, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush Jain, Théophile Cabannes, Georgios Piliouras, Julien Pérolat, Romuald Elie, Olivier Pietquin, Matthieu Geist
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate numerically that these methods efficiently enable the use of Deep RL algorithms to solve various MFGs. In addition, we show that these methods outperform SotA baselines from the literature. |
| Researcher Affiliation | Collaboration | NYU Shanghai, China; Univ. Lille, CNRS, Inria, Centrale Lille, UMR 9189 CRIStAL, France; Google Research; DeepMind; UC Berkeley, California, USA; Singapore University of Technology and Design, Singapore. |
| Pseudocode | Yes | Algorithm 1 D-AFP, Algorithm 2 D-MOMD, Algorithm 3 Forward update for the distribution, Algorithm 4 Backward induction for the value function evaluation, Algorithm 5 Backward induction for the optimal value function, Algorithm 6 Banach-Picard (BP) fixed point, Algorithm 7 Fictitious Play (FP), Algorithm 8 Policy Iteration (PI), Algorithm 9 Online Mirror Descent (OMD). A minimal Fictitious Play sketch follows the table. |
| Open Source Code | Yes | The code for Deep Munchausen OMD is available in OpenSpiel (Lanctot et al., 2019). See https://github.com/deepmind/open_spiel/blob/master/open_spiel/python/mfg/algorithms/munchausen_deep_mirror_descent.py. |
| Open Datasets | No | The paper describes several models/environments (Epidemics, Linear-Quadratic MFG, Exploration, Crowd modeling with congestion, Multi-population chasing) but does not provide concrete access information (link, DOI, repository, or formal citation with authors/year) for publicly available datasets used for training. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology). |
| Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running experiments are provided. |
| Software Dependencies | No | The paper mentions OpenSpiel and uses deep RL methods such as DQN, but it does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | For this example and the following ones, we display the best exploitability curves obtained for each method after running sweeps over hyperparameters. See Appx. D for some instances of sweeps for D-MOMD. (Figure 7 shows the specific hyperparameters tested: learning_rate = 0.001, 0.01, 0.05, 0.1 and tau = 1, 5, 10, 50; see the sweep sketch after the table.) |
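
The appendix algorithms listed in the Pseudocode row share a common fixed-point structure: repeatedly compute a best response against the current mean field, then update the population distribution. The following minimal Python sketch illustrates that structure with exact Fictitious Play (Algorithm 7) on a toy crowd-aversion ring game; the game, the helper names, and the uniform initial distribution are illustrative assumptions, not one of the paper's benchmark environments.

```python
import numpy as np

# Toy MFG: agents on a ring of S states, horizon T, crowd-averse reward
# -log(mu_t(x)) that discourages congestion. Best responses are computed
# exactly by backward induction (the role Algorithms 3-5 play in the paper).

S, T = 10, 15                      # number of states, horizon
ACTIONS = np.array([-1, 0, 1])     # move left, stay, move right (on a ring)

def reward(mu_t):
    return -np.log(mu_t + 1e-8)    # crowd-averse reward, one value per state

def best_response(mu):             # mu: (T, S) distribution flow to respond to
    """Exact best response via backward induction; returns a (T, S) policy of action indices."""
    V = np.zeros(S)
    policy = np.zeros((T, S), dtype=int)
    for t in reversed(range(T)):
        # Q[a, x]: reward at x now plus the value of the next state x + ACTIONS[a]
        Q = reward(mu[t])[None, :] + np.stack([np.roll(V, -a) for a in ACTIONS])
        policy[t] = Q.argmax(axis=0)
        V = Q.max(axis=0)
    return policy

def induced_distribution(policy):  # forward update on the distribution (cf. Algorithm 3)
    mu = np.zeros((T, S))
    mu[0] = np.ones(S) / S         # uniform initial distribution (an assumption)
    for t in range(T - 1):
        for x in range(S):
            mu[t + 1, (x + ACTIONS[policy[t, x]]) % S] += mu[t, x]
    return mu

def fictitious_play(num_iterations=50):
    mu_bar = induced_distribution(np.zeros((T, S), dtype=int))
    for k in range(1, num_iterations + 1):
        pi_k = best_response(mu_bar)                             # best respond to the average
        mu_bar += (induced_distribution(pi_k) - mu_bar) / k      # uniform averaging over iterations
    return mu_bar

print(fictitious_play()[-1].round(3))  # the late-time distribution flattens toward uniform
```

Deep variants such as D-AFP replace the exact best response with an approximate one learned by deep RL, while the averaging over iterations is kept.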
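
The hyperparameter sweep described in the Experiment Setup row can be reproduced in spirit with a simple grid search. The sketch below iterates over the learning_rate and tau values quoted from Figure 7 and keeps the configuration with the lowest exploitability; `train_d_momd` and `exploitability` are hypothetical placeholders for the paper's D-MOMD training loop and metric, not real OpenSpiel entry points, and here they return dummy values so the loop runs.

```python
import itertools
import random

# Grid values quoted from Appx. D / Figure 7 of the paper.
LEARNING_RATES = [0.001, 0.01, 0.05, 0.1]
TAUS = [1, 5, 10, 50]  # Munchausen temperature parameter

def train_d_momd(learning_rate, tau):
    """Placeholder: would train D-MOMD and return the resulting policy."""
    return {"learning_rate": learning_rate, "tau": tau}

def exploitability(policy):
    """Placeholder: would evaluate the exploitability of `policy`."""
    return random.random()

def sweep():
    scores = {
        (lr, tau): exploitability(train_d_momd(lr, tau))
        for lr, tau in itertools.product(LEARNING_RATES, TAUS)
    }
    best = min(scores, key=scores.get)  # best curve per method, as the paper reports
    return best, scores[best]

print(sweep())
```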