Scalable Deep Reinforcement Learning Algorithms for Mean Field Games

Authors: Mathieu Lauriere, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush Jain, Theophile Cabannes, Georgios Piliouras, Julien Perolat, Romuald Elie, Olivier Pietquin, Matthieu Geist

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate numerically that these methods efficiently enable the use of Deep RL algorithms to solve various MFGs. In addition, we show that these methods outperform SOTA baselines from the literature."
Researcher Affiliation | Collaboration | 1 NYU Shanghai, China; 2 Univ. Lille, CNRS, Inria, Centrale Lille, UMR 9189 CRIStAL, France; 3 Google Research; 4 DeepMind; 5 UC Berkeley, California, USA; 6 Singapore University of Technology and Design, Singapore.
Pseudocode | Yes | Algorithm 1 D-AFP; Algorithm 2 D-MOMD; Algorithm 3 Forward update for the distribution; Algorithm 4 Backward induction for the value function evaluation; Algorithm 5 Backward induction for the optimal value function; Algorithm 6 Banach-Picard (BP) fixed point; Algorithm 7 Fictitious Play (FP); Algorithm 8 Policy Iteration (PI); Algorithm 9 Online Mirror Descent (OMD).
Open Source Code | Yes | The code for Deep Munchausen OMD is available in OpenSpiel (Lanctot et al., 2019). See https://github.com/deepmind/open_spiel/blob/master/open_spiel/python/mfg/algorithms/munchausen_deep_mirror_descent.py.
Open Datasets | No | The paper describes several models/environments (Epidemics, Linear-Quadratic MFG, Exploration, Crowd modeling with congestion, Multi-population chasing) but does not provide concrete access information (link, DOI, repository, or formal citation with authors/year) for publicly available datasets used for training.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology).
Hardware Specification | No | No specific hardware details (GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running the experiments are provided.
Software Dependencies | No | The paper mentions OpenSpiel and uses deep RL methods such as DQN, but it does not provide specific version numbers for any software dependencies or libraries.
Experiment Setup | Yes | "For this example and the following ones, we display the best exploitability curves obtained for each method after running sweeps over hyperparameters. See Appx. D for some instances of sweeps for D-MOMD." (Figure 7 shows the specific hyperparameters tested: learning_rate ∈ {0.001, 0.01, 0.05, 0.1} and tau ∈ {1, 5, 10, 50}.)
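Of the pseudocode listed in the table, the Banach-Picard (BP) fixed point (Algorithm 6) is the simplest to sketch. The following is a minimal illustrative sketch, not the paper's implementation: the operator `T` here is a toy mean-field update (pushing a distribution through a fixed transition kernel), standing in for the paper's map from a mean field to the distribution induced by the best response against it. The kernel `A` and the helper names are assumptions for the example.

```python
def banach_picard(T, x0, tol=1e-8, max_iter=1000):
    """Iterate x_{k+1} = T(x_k) until successive iterates differ by < tol.

    Convergence is guaranteed when T is a contraction (Banach fixed-point
    theorem); in the MFG setting T would map a mean-field distribution to
    the distribution induced by the best response against it.
    """
    x = x0
    for _ in range(max_iter):
        x_next = T(x)
        if max(abs(a - b) for a, b in zip(x_next, x)) < tol:
            return x_next
        x = x_next
    return x

# Toy row-stochastic transition kernel over 3 states (illustrative only).
A = [[0.6, 0.2, 0.2],
     [0.3, 0.5, 0.2],
     [0.1, 0.3, 0.6]]

def T(mu):
    # One mean-field update: push the distribution mu through the kernel A.
    return [sum(mu[i] * A[i][j] for i in range(3)) for j in range(3)]

# Starting from a point mass, the iterates converge to a fixed point of T
# (here, the stationary distribution of the kernel).
mu_star = banach_picard(T, [1.0, 0.0, 0.0])
```

Fictitious Play (Algorithm 7) differs mainly in that it averages the distributions produced across iterations instead of keeping only the latest one, which stabilizes the iteration when the plain fixed-point map fails to contract.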
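The hyperparameter sweep described in the last row can be sketched as a grid search over the reported D-MOMD values, keeping the configuration with the lowest exploitability. This is a hedged sketch: `evaluate` is a hypothetical callable standing in for a full training run, not an API from the paper's code.

```python
import itertools

# Grid reported in the paper's Appendix D sweeps for D-MOMD (Figure 7).
LEARNING_RATES = [0.001, 0.01, 0.05, 0.1]
TAUS = [1, 5, 10, 50]

def run_sweep(evaluate):
    """Return the (learning_rate, tau) pair with the lowest exploitability.

    `evaluate(lr, tau)` is a hypothetical stand-in that trains D-MOMD with
    the given hyperparameters and returns the final exploitability.
    """
    results = {
        (lr, tau): evaluate(lr, tau)
        for lr, tau in itertools.product(LEARNING_RATES, TAUS)
    }
    best = min(results, key=results.get)
    return best, results[best]

# Usage with a dummy evaluator (illustrative only, not a real training run):
best_cfg, best_expl = run_sweep(lambda lr, tau: lr * tau)
```

In the paper the reported curves are the best obtained per method after such sweeps, so the selection criterion here (minimum exploitability over the grid) mirrors that procedure.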