Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
$O(T^{-1})$ Convergence of Optimistic-Follow-the-Regularized-Leader in Two-Player Zero-Sum Markov Games
Authors: Yuepeng Yang, Cong Ma
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove that optimistic-follow-the-regularized-leader (OFTRL), together with smooth value updates, finds an $O(T^{-1})$-approximate Nash equilibrium in $T$ iterations for two-player zero-sum Markov games with full information. This improves the $O(T^{-5/6})$ convergence rate recently shown in the paper by Zhang et al. (2022b). The refined analysis hinges on two essential ingredients. |
| Researcher Affiliation | Academia | Department of Statistics, University of Chicago; Email: EMAIL |
| Pseudocode | Yes | Algorithm 1 Optimistic-follow-the-regularized-leader for solving two-player zero-sum Markov games |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments on datasets, so it does not specify any dataset for training. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments on datasets, so it does not provide dataset split information. |
| Hardware Specification | No | The paper is theoretical and does not describe experiments that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training settings. |
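
Since no code accompanies the paper, the following is a minimal illustrative sketch of the OFTRL update named in the table, not the paper's Algorithm 1. It assumes the single-state special case (a zero-sum matrix game) with an entropy regularizer, and it omits the smooth value updates that the Markov-game setting needs. The function names, the payoff matrix, and the values of `T` and `eta` are hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def oftrl_matrix_game(A, T=5000, eta=0.1):
    """Entropy-regularized OFTRL for min_x max_y x^T A y (assumed setting).

    Each player plays against the cumulative payoff vector plus one extra
    copy of the most recent payoff vector (the optimistic prediction).
    Returns the time-averaged strategies of both players.
    """
    n, m = len(A), len(A[0])
    x = [1.0 / n] * n            # row player (minimizer)
    y = [1.0 / m] * m            # column player (maximizer)
    cum_loss = [0.0] * n         # running sum of A y_s
    cum_gain = [0.0] * m         # running sum of A^T x_s
    x_sum = [0.0] * n
    y_sum = [0.0] * m
    for _ in range(T):
        x_sum = [a + b for a, b in zip(x_sum, x)]
        y_sum = [a + b for a, b in zip(y_sum, y)]
        loss = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
        gain = [sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]
        cum_loss = [c + l for c, l in zip(cum_loss, loss)]
        cum_gain = [c + g for c, g in zip(cum_gain, gain)]
        # Optimistic step: the latest payoff vector is counted twice.
        x = softmax([-eta * (cum_loss[i] + loss[i]) for i in range(n)])
        y = softmax([eta * (cum_gain[j] + gain[j]) for j in range(m)])
    return [v / T for v in x_sum], [v / T for v in y_sum]


def duality_gap(A, x, y):
    """max_j (A^T x)_j - min_i (A y)_i; zero exactly at a Nash equilibrium."""
    n, m = len(A), len(A[0])
    best_col = max(sum(A[i][j] * x[i] for i in range(n)) for j in range(m))
    best_row = min(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    return best_col - best_row
```

For example, calling `oftrl_matrix_game([[2.0, -1.0], [-1.0, 1.0]])` and passing the returned averages to `duality_gap` gives a rough numerical view of how close the averaged strategies are to equilibrium. The step size here is a conservative guess rather than a value taken from the paper.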