Generalization in Mean Field Games by Learning Master Policies
Authors: Sarah Perrin, Mathieu Laurière, Julien Pérolat, Romuald Élie, Matthieu Geist, Olivier Pietquin
AAAI 2022, pp. 9413–9421 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate on numerical examples not only the efficiency of the learned Master policy but also its generalization capabilities beyond the distributions used for training. Last, we provide empirical evidence that not only this method learns the Master policy on a training set of distributions, but that the learned policy generalizes to unseen distributions. |
| Researcher Affiliation | Collaboration | 1. Univ. Lille, CNRS, Inria, Centrale Lille, UMR 9189 CRIStAL; 2. Google Research, Brain Team; 3. DeepMind Paris |
| Pseudocode | Yes | Algorithm 1: Master Fictitious Play (an illustrative sketch of the Fictitious Play loop follows this table) |
| Open Source Code | No | The paper mentions using a "default implementation of RLlib (Liang et al. 2017)" but does not provide any link or statement about releasing the source code for their specific method or implementation. |
| Open Datasets | No | The paper uses custom environments described as "inspired by Geist et al. (2022)" and "introduced by Perrin et al. (2020)". It defines its own "training set M composed of Gaussian distributions" and "testing set" by describing their generation process (see the training-set sketch after this table), rather than referencing a publicly available dataset with concrete access information (e.g., URL, DOI, specific repository, or standard benchmark citation with authors/year). |
| Dataset Splits | No | The paper refers to a "training set M" and a "testing set" of distributions for its experiments. It describes how these sets are composed (e.g., Gaussian distributions with different means or random distributions). However, it does not specify any quantitative splits (e.g., percentages, sample counts) for training, validation, or testing, nor does it refer to standard, predefined splits for any dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as CPU or GPU models, memory, or cloud instance types. It mentions using "Deep RL" and "neural networks" but no hardware specifications are given. |
| Software Dependencies | No | The paper states: "For the numerical results presented below, we used the default implementation of RLlib (Liang et al. 2017)." While RLlib is a software component, its version number is not specified, and no other key software dependencies (e.g., Python, PyTorch, TensorFlow versions) are listed. |
| Experiment Setup | No | The paper describes the environment (1D/2D, state/action space, reward function) and the neural network architecture (e.g., use of Conv Nets for 2D). However, it does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of training steps/epochs), optimizer settings, or other concrete training configurations. |
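To make the Pseudocode row concrete, below is a minimal tabular sketch of the Fictitious Play loop that Algorithm 1 (Master Fictitious Play) builds on: repeatedly compute a best response against the average distribution flow, then average the flows. This is not the paper's implementation, which uses deep RL via RLlib and conditions the learned Master policy on the observed distribution; the grid size, horizon, and crowd-aversion reward here are assumptions for illustration.

```python
import numpy as np

# Toy 1D grid MFG. All names and the reward r(x, mu) = -log(mu[x])
# are illustrative assumptions, not the paper's code.
n_states, horizon, n_iters = 20, 30, 50
actions = [-1, 0, 1]  # move left, stay, move right

def best_response(mu_flow):
    """Backward induction: greedy policy against a fixed distribution flow."""
    V = np.zeros(n_states)
    pi = np.zeros((horizon, n_states), dtype=int)
    for t in reversed(range(horizon)):
        Q = np.empty((n_states, len(actions)))
        for x in range(n_states):
            for a_idx, a in enumerate(actions):
                y = min(max(x + a, 0), n_states - 1)
                # crowd-aversion reward at the next state, plus continuation value
                Q[x, a_idx] = -np.log(mu_flow[t][y] + 1e-8) + V[y]
        pi[t] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return pi

def forward_mu(pi, mu0):
    """Roll the population distribution forward under a policy."""
    mu_flow, mu = [mu0], mu0
    for t in range(horizon - 1):
        nxt = np.zeros(n_states)
        for x in range(n_states):
            y = min(max(x + actions[pi[t][x]], 0), n_states - 1)
            nxt[y] += mu[x]
        mu_flow.append(nxt)
        mu = nxt
    return mu_flow

mu0 = np.ones(n_states) / n_states           # uniform initial distribution
avg_flow = [mu0.copy() for _ in range(horizon)]
for j in range(1, n_iters + 1):
    pi_br = best_response(avg_flow)          # best response to the average flow
    br_flow = forward_mu(pi_br, mu0)
    # Fictitious Play averaging of distribution flows
    avg_flow = [((j - 1) * m + b) / j for m, b in zip(avg_flow, br_flow)]
```

The Master variant differs in that a single policy takes (state, distribution) as input and is trained across many initial distributions, which is what enables generalization to unseen ones.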
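Relatedly, the Open Datasets and Dataset Splits rows note that the paper constructs a training set M of Gaussian distributions and a held-out testing set rather than using a standard dataset. A minimal sketch of how such sets could be built on a 1D grid; the grid size, means, standard deviation, and test-set construction are assumed values, not taken from the paper.

```python
import numpy as np

def gaussian_on_grid(n_states, mean, sigma):
    """Discretized, normalized Gaussian over grid states 0..n_states-1."""
    x = np.arange(n_states)
    density = np.exp(-0.5 * ((x - mean) / sigma) ** 2)
    return density / density.sum()

n_states = 20
# Training set M: Gaussians with varying means (assumed spacing and sigma)
train_set = [gaussian_on_grid(n_states, m, sigma=2.0) for m in range(2, 18, 3)]
# Testing set: random distributions unseen during training
rng = np.random.default_rng(0)
test_set = [rng.dirichlet(np.ones(n_states)) for _ in range(5)]
```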