Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games
Authors: Tanner Fiez, Lillian Ratliff, Eric Mazumdar, Evan Faulkner, Adhyyan Narang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study gradient descent-ascent learning dynamics with timescale separation (τ-GDA) in unconstrained, continuous-action zero-sum games where the minimizing player faces a nonconvex optimization problem and the maximizing player optimizes a Polyak-Łojasiewicz (PŁ) or strongly-concave (SC) objective. In contrast to past work on gradient-based learning in nonconvex-PŁ/SC zero-sum games, we assess convergence in relation to natural game-theoretic equilibria instead of only notions of stationarity. In pursuit of this goal, we prove that the only locally stable points of the τ-GDA continuous-time limiting system correspond to strict local minmax equilibria in each class of games. For these classes of games, we exploit timescale separation to construct a potential function that, when combined with the stability characterization and an asymptotic saddle-avoidance result, gives a global asymptotic almost-sure convergence guarantee for the discrete-time gradient descent-ascent update to the set of strict local minmax equilibria. Moreover, we provide convergence rates for the gradient descent-ascent dynamics with timescale separation to approximate stationary points. |
| Researcher Affiliation | Academia | Tanner Fiez (University of Washington, fiezt@uw.edu); Lillian J. Ratliff (University of Washington, ratliffl@uw.edu); Eric Mazumdar (California Institute of Technology, mazumdar@caltech.edu); Evan Faulkner (University of Washington, evanjf5@uw.edu); Adhyyan Narang (University of Washington, adhyyan@uw.edu) |
| Pseudocode | Yes | Algorithm 1 (τ-GDA). Input: $x_0 \in \mathbb{R}^{d_1}$, $y_0 \in \mathbb{R}^{d_2}$; for $k = 0, 1, \ldots$ do $x_{k+1} \leftarrow x_k - \gamma \nabla_1 f(x_k, y_k)$, $y_{k+1} \leftarrow y_k + \gamma\tau \nabla_2 f(x_k, y_k)$; end for. Algorithm 2 (Stochastic τ-GDA). Input: $x_0 \in \mathbb{R}^{d_1}$, $y_0 \in \mathbb{R}^{d_2}$; for $k = 0, 1, \ldots$ do $x_{k+1} \leftarrow x_k - \gamma g_1(x_k, y_k; \theta_{1,k})$, $y_{k+1} \leftarrow y_k + \gamma\tau g_2(x_k, y_k; \theta_{2,k})$; end for. Runnable sketches of both updates follow the table. |
| Open Source Code | No | The author checklist marks code inclusion as 'N/A', and the main text contains no links or statements about code availability. |
| Open Datasets | No | The paper is theoretical and does not describe experiments involving datasets. The author checklist states 'N/A' for experimental results. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments involving data splits. The author checklist states 'N/A' for experimental results. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware specifications. The author checklist states 'N/A' for experimental results. |
| Software Dependencies | No | The paper is theoretical and does not describe experiments or their software dependencies. The author checklist states 'N/A' for experimental results. |
| Experiment Setup | No | The paper is theoretical and does not describe experimental setups, hyperparameters, or training configurations. The author checklist states 'N/A' for experimental results. |
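To make Algorithm 1 concrete, here is a minimal sketch of the τ-GDA update on a toy zero-sum game. The objective $f(x, y) = x^4/4 - x^2/2 + xy/2 - y^2/2$, the step size, the timescale parameter τ, and the iteration budget are illustrative assumptions, not values from the paper. The game is nonconvex in the minimizing variable $x$ and strongly concave in $y$, and a direct check of the second-order conditions shows strict local minmax equilibria at $x^* = \pm\sqrt{3}/2$, $y^* = x^*/2$.

```python
# Illustrative zero-sum game (not from the paper):
#   f(x, y) = x**4/4 - x**2/2 + 0.5*x*y - y**2/2
# Nonconvex in the minimizer's variable x, 1-strongly concave in y.

def grad_x(x, y):  # partial derivative of f with respect to x
    return x**3 - x + 0.5 * y

def grad_y(x, y):  # partial derivative of f with respect to y
    return 0.5 * x - y

def tau_gda(x0, y0, gamma=0.01, tau=10.0, iters=20000):
    """Algorithm 1: simultaneous gradient descent-ascent where the
    maximizing player moves on a faster timescale (step gamma * tau)."""
    x, y = x0, y0
    for _ in range(iters):
        # both players update from the same iterate (simultaneous play)
        x, y = x - gamma * grad_x(x, y), y + gamma * tau * grad_y(x, y)
    return x, y

print(tau_gda(x0=0.1, y0=-0.5))
# approaches roughly (0.866, 0.433), i.e. (sqrt(3)/2, sqrt(3)/4),
# a strict local minmax equilibrium of this toy game
```

With τ large, $y$ tracks the best response $y^*(x) = x/2$ quickly, so the slow $x$ update approximately descends the envelope $\phi(x) = \max_y f(x, y)$, which is the intuition behind the potential-function construction.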
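Algorithm 2 replaces the exact gradients with unbiased stochastic estimates $g_1, g_2$ indexed by random variables $\theta_{1,k}, \theta_{2,k}$. The sketch below instantiates those oracles as the toy gradients above plus Gaussian noise; the noise model, scale sigma, and step sizes are again illustrative assumptions rather than the paper's setting. With a constant step size the iterates settle into a neighborhood of the equilibrium rather than converging exactly.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def g1(x, y, sigma=0.1):  # noisy oracle for grad_x of the toy game above
    return x**3 - x + 0.5 * y + sigma * rng.standard_normal()

def g2(x, y, sigma=0.1):  # noisy oracle for grad_y of the toy game above
    return 0.5 * x - y + sigma * rng.standard_normal()

def stochastic_tau_gda(x0, y0, gamma=0.005, tau=10.0, iters=100000):
    """Algorithm 2: the same tau-GDA update driven by unbiased
    stochastic gradient estimates in place of exact gradients."""
    x, y = x0, y0
    for _ in range(iters):
        x, y = x - gamma * g1(x, y), y + gamma * tau * g2(x, y)
    return x, y

print(stochastic_tau_gda(x0=0.1, y0=-0.5))
# hovers near the same equilibrium, up to fluctuations of order sigma
```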