Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Learning Deep Mean Field Games for Modeling Large Population Behavior
Authors: Jiachen Yang, Xiaojing Ye, Rakshit Trivedi, Huan Xu, Hongyuan Zha
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method learns both the reward function and forward dynamics of an MFG from real data, and we report the first empirical test of a mean field game model of a real-world social media population. |
| Researcher Affiliation | Academia | ¹Georgia Institute of Technology; ²Georgia State University |
| Pseudocode | Yes | Algorithm 1 Guided cost learning; Algorithm 2 Actor-critic algorithm for MFG |
| Open Source Code | No | The paper does not include a statement about releasing source code or a link to a code repository. |
| Open Datasets | No | We use data representing the activity of a Twitter population consisting of 406 users. Data collection proceeded as follows for 27 days... |
| Dataset Splits | No | The paper mentions 'The training set consists of trajectories...over the first M = 21 days' and 'evaluated against data from 6 held-out test days', but does not explicitly detail a separate validation set or its split. |
| Hardware Specification | No | The paper does not specify any hardware details used for running the experiments. |
| Software Dependencies | No | VAR was implemented using the Statsmodels module in Python, with order 18 selected via random sub-sampling validation with validation set size 5 (Seabold & Perktold, 2010). All layers were initialized using the Xavier normal initializer in Tensorflow. |
| Experiment Setup | Yes | Table 1: Parameters. S (max actor-critic episodes): 4000; β (critic learning rate): O(1/s); ξ (actor learning rate): O(1/s ln ln s); c (α_ij scaling factor): 1e4; ϵ (Adam optimizer learning rate for reward): 1e-4; d R (convergence threshold for reward iteration): 1e-4; θ_final (learned policy parameter): 8.64 |
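
The split and learning-rate entries in the table above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the 6 held-out test days are the days following the first 21 training days, that O(1/s ln ln s) is read as 1/(s ln ln s), and that the constant hidden by the O-notation is 1. The names `split_days`, `critic_lr`, `actor_lr`, and `scale` are hypothetical.

```python
import math

# Values quoted in the table above (Table 1 of the paper and the data description).
MAX_EPISODES = 4000    # S: max actor-critic episodes
REWARD_ADAM_LR = 1e-4  # Adam optimizer learning rate for the reward
TOTAL_DAYS = 27        # days of data collection
TRAIN_DAYS = 21        # M: first M days used for training


def split_days(total_days: int = TOTAL_DAYS, train_days: int = TRAIN_DAYS):
    """Return (train, test) day indices, 1-based.

    Assumption: the held-out test days are the days after the training window.
    """
    days = list(range(1, total_days + 1))
    return days[:train_days], days[train_days:]


def critic_lr(s: int, scale: float = 1.0) -> float:
    """Critic step size decaying as O(1/s); `scale` is an assumed constant."""
    if s < 1:
        raise ValueError("episode index s must be >= 1")
    return scale / s


def actor_lr(s: int, scale: float = 1.0) -> float:
    """Actor step size, assuming O(1/s ln ln s) means 1/(s * ln(ln(s))).

    ln(ln(s)) is positive only for s > e, so integer s must be at least 3.
    """
    if s < 3:
        raise ValueError("episode index s must be >= 3 for ln(ln(s)) > 0")
    return scale / (s * math.log(math.log(s)))


if __name__ == "__main__":
    train, test = split_days()
    print(f"{len(train)} training days, {len(test)} test days")
    for s in (100, MAX_EPISODES):
        print(s, critic_lr(s), actor_lr(s))
```

Under this reading the actor step size shrinks faster than the critic step size, so the ratio of the two tends to zero as the episode index grows.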