Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Fictitious Play and Best-Response Dynamics in Identical Interest and Zero-Sum Stochastic Games
Authors: Lucas Baudin, Rida Laraki
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper proposes an extension of a popular decentralized discrete-time learning procedure when repeating a static game called fictitious play (FP) (Brown, 1951; Robinson, 1951) to a dynamic model called discounted stochastic game (Shapley, 1953). Our family of discrete-time FP procedures is proven to converge to the set of stationary Nash equilibria in identical interest discounted stochastic games. This extends similar convergence results for static games (Monderer & Shapley, 1996a). We then analyze the continuous-time counterpart of our FP procedures, which include as a particular case the best-response dynamic introduced and studied by Leslie et al. (2020) in the context of zero-sum stochastic games. We prove the converge of this dynamics to stationary Nash equilibria in identical-interest and zero-sum discounted stochastic games. Thanks to stochastic approximations, we can infer from the continuous-time convergence some discrete time results such as the convergence to stationary equilibria in zero sum and team stochastic games (Holler, 2020). ... Contributions. Our contributions are as follows: We define procedures to play stochastic games in discrete time combining ideas from fictitious play and Q-learning. We prove their convergence to the set of stationary Nash equilibria in identical interest stochastic games. The proof is given directly in discrete time. We define the continuous-time counterpart of our procedures. ... We prove the convergence of our continuous-time dynamics to the set of stationary equilibria in identical interest and in zero-sum stochastic games. The convergence in continuous time allows us to prove some new convergence results in discrete time for zero-sum and team stochastic games (Holler, 2020). |
| Researcher Affiliation | Academia | 1Universit e Paris-Dauphine PSL, France 2University of Liverpool, United Kingdom. Correspondence to: Lucas Baudin <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 FP with Doubling Trick for Zero-Sum Games |
| Open Source Code | No | The paper does not provide any statement or link indicating the release of open-source code for the described methodology. |
| Open Datasets | No | This is a theoretical paper focused on proving convergence of algorithms in game theory; it does not involve the use of empirical datasets for training. |
| Dataset Splits | No | This is a theoretical paper focused on proving convergence of algorithms in game theory; it does not involve data splits for validation or training. |
| Hardware Specification | No | The paper is theoretical and focuses on mathematical proofs and algorithm design; it does not describe any experimental setup that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical, focusing on mathematical concepts and algorithm proofs. It does not mention any specific software dependencies with version numbers for implementation or experimental purposes. |
| Experiment Setup | No | The paper is theoretical and focuses on mathematical proofs and algorithm design; it does not describe an experimental setup with hyperparameters or system-level training settings. |