Learning Graphon Mean Field Games and Approximate Nash Equilibria
Authors: Kai Cui, Heinz Koeppl
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we are able to demonstrate on a number of examples that the finite-agent behavior comes increasingly close to the mean field behavior for our computed equilibria as the graph or system size grows, verifying our theory. |
| Researcher Affiliation | Academia | Kai Cui & Heinz Koeppl Department of Electrical Engineering, Technische Universität Darmstadt, Germany {kai.cui,heinz.koeppl}@bcs.tu-darmstadt.de |
| Pseudocode | Yes | Algorithm 1: Fixed point iteration; Algorithm 2: Backwards induction; Algorithm 3: Forward simulation; Algorithm 4: Sequential Monte Carlo |
| Open Source Code | Yes | For reproducibility, in the supplement we provide all code required to reproduce all results in this work. |
| Open Datasets | No | The paper describes the generation of synthetic graphs rather than the use of publicly available datasets. |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages, sample counts for training, validation, or test sets). The experiments are simulations on generated graphs, not based on pre-split datasets. |
| Hardware Specification | No | We ran each trial of our experiments on a single conventional CPU core, with typical wall-clock times reaching up to at most a few days. We estimate the required compute to approximately 6500 core hours. We did not use any GPUs or TPUs. |
| Software Dependencies | Yes | For PPO, we used the RLlib implementation by Liang et al. (2018) (version 1.2.0, Apache-2.0 license). |
| Experiment Setup | Yes | As for the specific configurations used in the PPO experiments, we give the hyperparameters in Table 1 and used a feedforward neural network policy consisting of two hidden layers with 256 nodes and tanh activations, outputting a softmax policy over all actions. |
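
The Pseudocode row above names a fixed point iteration that alternates backwards induction and forward simulation. The snippet below is a minimal Python sketch of that general mean field scheme in a toy finite-state, finite-horizon setting; it is not the authors' implementation, and names such as `transition`, `reward`, `n_states`, and the specific dynamics are illustrative assumptions.

```python
import numpy as np

# Toy dimensions for illustration only (not from the paper).
n_states, n_actions, horizon = 3, 2, 10

def transition(mu):
    """Hypothetical mean-field-dependent transition kernel P[s, a, s']."""
    P = np.random.default_rng(0).random((n_states, n_actions, n_states)) + mu[None, None, :]
    return P / P.sum(axis=-1, keepdims=True)

def reward(mu):
    """Hypothetical mean-field-dependent reward r[s, a] (e.g. crowd aversion)."""
    return -np.outer(mu, np.ones(n_actions))

def backwards_induction(mus):
    """Greedy best-response policy against a fixed mean field flow."""
    policy = np.zeros((horizon, n_states), dtype=int)
    V = np.zeros(n_states)
    for t in reversed(range(horizon)):
        Q = reward(mus[t]) + transition(mus[t]) @ V  # Q[s, a]
        policy[t] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy

def forward_simulation(policy, mu0):
    """Propagate the state distribution induced by the policy."""
    mus = [mu0]
    for t in range(horizon - 1):
        P = transition(mus[t])
        mu_next = np.zeros(n_states)
        for s in range(n_states):
            mu_next += mus[t][s] * P[s, policy[t][s]]
        mus.append(mu_next)
    return mus

# Fixed point iteration: alternate best response and mean field update.
# In practice the plain iteration may oscillate and is often damped or smoothed.
mu0 = np.ones(n_states) / n_states
mus = [mu0] * horizon
for _ in range(50):
    policy = backwards_induction(mus)
    mus = forward_simulation(policy, mu0)
```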
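
The Experiment Setup row quotes a policy architecture with two hidden layers of 256 tanh units and a softmax over actions. As a concrete illustration, here is a minimal PyTorch sketch of such a network; PyTorch and the names `PolicyNet`, `obs_dim`, and `n_actions` are assumptions for illustration, since the authors used the RLlib PPO implementation rather than this code.

```python
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Sketch of the described architecture: two 256-unit tanh hidden layers
    followed by a softmax over actions (illustrative, not the authors' config)."""

    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.Tanh(),
            nn.Linear(256, 256), nn.Tanh(),
            nn.Linear(256, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Return a probability distribution over all actions.
        return torch.softmax(self.body(obs), dim=-1)

# Example usage with placeholder dimensions.
policy = PolicyNet(obs_dim=4, n_actions=3)
probs = policy(torch.zeros(1, 4))  # shape (1, 3), rows sum to 1
```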