Learning Mean-Field Games
Authors: Xin Guo, Anran Hu, Renyuan Xu, Junzi Zhang
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments on repeated Ad auction problems demonstrate that this GMF-Q algorithm is efficient and robust in terms of convergence and learning accuracy. |
| Researcher Affiliation | Academia | Xin Guo University of California, Berkeley xinguo@berkeley.edu Anran Hu University of California, Berkeley anran_hu@berkeley.edu Renyuan Xu University of California, Berkeley renyuanxu@berkeley.edu Junzi Zhang Stanford University junziz@stanford.edu |
| Pseudocode | Yes | Algorithm 1 Q-learning for GMFGs (GMF-Q) |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | No | The paper describes a simulated repeated Ad auction game with specific parameters, rather than using a publicly available or open dataset. No concrete access information for a dataset is provided. |
| Dataset Splits | No | The paper describes a simulated environment but does not specify training, validation, or test dataset splits in the context of typical machine learning dataset partitioning. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | Parameters. The model parameters are set as: \|S\| = \|A\| = 10, the overbidding penalty ρ = 0.2, the distribution of the conversion rate v ∼ uniform({1, 2, 3, 4}), and the competition intensity index M = 5. The random fulfillment is chosen as: if s < s_max, the fulfillment is 1 with probability 1/2 and 0 with probability 1/2; if s = s_max, the fulfillment is 0. The algorithm parameters are (unless otherwise specified): the temperature parameter c = 4.0, the discount factor γ = 0.8, the parameter h from Lemma 8 in the Appendix set to h = 0.87, and the baseline number of inner iterations set to 2000. |
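
Since no code is released with the paper, the reported setup can be collected into a small configuration for anyone attempting a reproduction. The following is a minimal sketch, not the authors' implementation: the names `AdAuctionConfig`, `sample_fulfillment`, and `sample_conversion_rate` are hypothetical, and indexing the states as 0 to 9 (so that s_max = 9) is an assumption not stated in the table.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AdAuctionConfig:
    """Values reported in the Experiment Setup row (field names are hypothetical)."""
    n_states: int = 10                    # |S|
    n_actions: int = 10                   # |A|
    overbid_penalty: float = 0.2          # rho
    conversion_rates: tuple = (1, 2, 3, 4)  # support of the uniform distribution of v
    competition_intensity: int = 5        # M
    temperature: float = 4.0              # c
    discount: float = 0.8                 # gamma
    h: float = 0.87                       # parameter h from Lemma 8
    inner_iterations: int = 2000          # baseline inner iteration count


def sample_fulfillment(s: int, s_max: int, rng: random.Random) -> int:
    """Random fulfillment: 0 or 1 with equal probability below s_max, else 0."""
    if s >= s_max:
        return 0
    return 1 if rng.random() < 0.5 else 0


def sample_conversion_rate(cfg: AdAuctionConfig, rng: random.Random) -> int:
    """Draw the conversion rate v uniformly from the configured support."""
    return rng.choice(cfg.conversion_rates)


if __name__ == "__main__":
    cfg = AdAuctionConfig()
    rng = random.Random(0)
    s_max = cfg.n_states - 1  # assumption: states are indexed 0..|S|-1
    print(sample_fulfillment(3, s_max, rng), sample_conversion_rate(cfg, rng))
```

Passing an explicit `random.Random` instance keeps the draws reproducible under a fixed seed, which matters here because the paper reports neither seeds nor software versions.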