Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
Authors: Fivos Kalogiannis, Ioannis Anagnostides, Ioannis Panageas, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Vaggos Chatziafratis, Stelios Andrew Stavroulakis
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our main contribution is the first algorithm for computing stationary ϵ-approximate Nash equilibria in adversarial team Markov games with computational complexity that is polynomial in all the natural parameters of the game, as well as 1/ϵ. The proposed algorithm is based on performing independent policy gradient steps for each player in the team, in tandem with best responses from the side of the adversary; in turn, the policy for the adversary is then obtained by solving a carefully constructed linear program. Our analysis leverages non-standard techniques to establish the KKT optimality conditions for a nonlinear program with nonconvex constraints, thereby leading to a natural interpretation of the induced Lagrange multipliers. |
| Researcher Affiliation | Academia | Fivos Kalogiannis UC Irvine Ioannis Anagnostides Carnegie Mellon University Ioannis Panageas UC Irvine Emmanouil V. Vlatakis-Gkaragkounis Columbia University Vaggos Chatziafratis UC Santa Cruz Stelios Stavroulakis UC Irvine |
| Pseudocode | Yes | Algorithm 1 Independent Policy Gradient Max (IPGMAX) |
| Open Source Code | No | The paper does not contain any statement about making its source code publicly available or providing a link to a code repository. |
| Open Datasets | No | The paper describes a theoretical framework and algorithm and does not include empirical experiments or discuss datasets. |
| Dataset Splits | No | The paper focuses on theoretical contributions and does not describe experiments with dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific details such as hyperparameters or training configurations. |