$O(T^{-1})$ Convergence of Optimistic-Follow-the-Regularized-Leader in Two-Player Zero-Sum Markov Games
Authors: Yuepeng Yang, Cong Ma
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove that optimistic-follow-the-regularized-leader (OFTRL), together with smooth value updates, finds an $O(T^{-1})$-approximate Nash equilibrium in $T$ iterations for two-player zero-sum Markov games with full information. This improves the $O(T^{-5/6})$ convergence rate recently shown in the paper by Zhang et al. (2022b). The refined analysis hinges on two essential ingredients. |
| Researcher Affiliation | Academia | Department of Statistics, University of Chicago; Email: {yuepengyang, congm}@uchicago.edu |
| Pseudocode | Yes | Algorithm 1: Optimistic-follow-the-regularized-leader for solving two-player zero-sum Markov games (an illustrative sketch of the OFTRL update follows this table) |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments on datasets, so it does not specify any dataset for training. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments on datasets, so it does not provide dataset split information. |
| Hardware Specification | No | The paper is theoretical and does not describe experiments that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training settings. |
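To make the method named in the table concrete, below is a minimal sketch of the OFTRL update with a negative-entropy regularizer (optimistic hedge) on a single two-player zero-sum matrix game. This is an illustration under stated assumptions, not the paper's Algorithm 1: the paper runs such an update per state of a Markov game and pairs it with smooth value updates, roughly of the form $V_{t+1}(s) = (1-\alpha_t)\,V_t(s) + \alpha_t\,\mu_t(s)^\top Q_t(s)\,\nu_t(s)$ (the exact averaging weights $\alpha_t$ are not reproduced here). The step size `eta`, iteration count `T`, and the random payoff matrix are illustrative placeholders.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    p = np.exp(z)
    return p / p.sum()

def oftrl_matrix_game(A, T=5000, eta=0.1):
    """Optimistic hedge (OFTRL with negative-entropy regularizer) for the
    zero-sum matrix game max_x min_y x^T A y. Each player updates via

        x_{t+1} proportional to exp(eta * (G_t + g_t)),

    where G_t is the cumulative gradient and the extra g_t is the
    optimistic prediction (the most recent gradient counted twice).
    Returns the averaged strategies and their duality gap.
    """
    m, n = A.shape
    x = np.full(m, 1.0 / m)   # row player's mixed strategy (maximizer)
    y = np.full(n, 1.0 / n)   # column player's mixed strategy (minimizer)
    Gx = np.zeros(m)          # cumulative payoff vector for the row player
    Gy = np.zeros(n)          # cumulative loss vector for the column player
    x_bar = np.zeros(m)
    y_bar = np.zeros(n)
    for _ in range(T):
        x_bar += x
        y_bar += y
        gx = A @ y            # row player's payoff per pure action
        gy = A.T @ x          # column player's loss per pure action
        Gx += gx
        Gy += gy
        x = softmax(eta * (Gx + gx))    # optimistic step: gx counted twice
        y = softmax(-eta * (Gy + gy))   # minimizer descends its loss
    x_bar /= T
    y_bar /= T
    # Duality gap of the averaged strategies: each side's best-response value.
    gap = (A @ y_bar).max() - (A.T @ x_bar).min()
    return x_bar, y_bar, gap

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.uniform(-1.0, 1.0, size=(4, 5))  # illustrative payoff matrix
    x_bar, y_bar, gap = oftrl_matrix_game(A)
    print(f"duality gap of average iterates: {gap:.2e}")
```

In this matrix-game special case, optimistic updates are known to drive the duality gap of the average iterates down at an $O(T^{-1})$ rate; the paper's contribution is establishing the same rate for the full Markov-game setting, where the stage payoffs are themselves the smoothly updated value estimates.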