Fictitious Play and Best-Response Dynamics in Identical Interest and Zero-Sum Stochastic Games
Authors: Lucas Baudin, Rida Laraki
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper proposes an extension of a popular decentralized discrete-time learning procedure when repeating a static game called fictitious play (FP) (Brown, 1951; Robinson, 1951) to a dynamic model called discounted stochastic game (Shapley, 1953). Our family of discrete-time FP procedures is proven to converge to the set of stationary Nash equilibria in identical interest discounted stochastic games. This extends similar convergence results for static games (Monderer & Shapley, 1996a). We then analyze the continuous-time counterpart of our FP procedures, which include as a particular case the best-response dynamic introduced and studied by Leslie et al. (2020) in the context of zero-sum stochastic games. We prove the converge of this dynamics to stationary Nash equilibria in identical-interest and zero-sum discounted stochastic games. Thanks to stochastic approximations, we can infer from the continuous-time convergence some discrete time results such as the convergence to stationary equilibria in zero sum and team stochastic games (Holler, 2020). ... Contributions. Our contributions are as follows: We define procedures to play stochastic games in discrete time combining ideas from fictitious play and Q-learning. We prove their convergence to the set of stationary Nash equilibria in identical interest stochastic games. The proof is given directly in discrete time. We define the continuous-time counterpart of our procedures. ... We prove the convergence of our continuous-time dynamics to the set of stationary equilibria in identical interest and in zero-sum stochastic games. The convergence in continuous time allows us to prove some new convergence results in discrete time for zero-sum and team stochastic games (Holler, 2020). |
| Researcher Affiliation | Academia | 1Universit e Paris-Dauphine PSL, France 2University of Liverpool, United Kingdom. Correspondence to: Lucas Baudin <lucas.baudin@dauphine.eu>. |
| Pseudocode | Yes | Algorithm 1 FP with Doubling Trick for Zero-Sum Games |
| Open Source Code | No | The paper does not provide any statement or link indicating the release of open-source code for the described methodology. |
| Open Datasets | No | This is a theoretical paper focused on proving convergence of algorithms in game theory; it does not involve the use of empirical datasets for training. |
| Dataset Splits | No | This is a theoretical paper focused on proving convergence of algorithms in game theory; it does not involve data splits for validation or training. |
| Hardware Specification | No | The paper is theoretical and focuses on mathematical proofs and algorithm design; it does not describe any experimental setup that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical, focusing on mathematical concepts and algorithm proofs. It does not mention any specific software dependencies with version numbers for implementation or experimental purposes. |
| Experiment Setup | No | The paper is theoretical and focuses on mathematical proofs and algorithm design; it does not describe an experimental setup with hyperparameters or system-level training settings. |