Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games

Authors: Tanner Fiez, Lillian Ratliff, Eric Mazumdar, Evan Faulkner, Adhyyan Narang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study gradient descent-ascent learning dynamics with timescale separation (τ-GDA) in unconstrained continuous-action zero-sum games where the minimizing player faces a nonconvex optimization problem and the maximizing player optimizes a Polyak-Łojasiewicz (PŁ) or strongly-concave (SC) objective. In contrast to past work on gradient-based learning in nonconvex-PŁ/SC zero-sum games, we assess convergence with respect to natural game-theoretic equilibria rather than only notions of stationarity. In pursuit of this goal, we prove that the only locally stable points of the τ-GDA continuous-time limiting system correspond to strict local minmax equilibria in each class of games. For these classes of games, we exploit the timescale separation to construct a potential function that, when combined with the stability characterization and an asymptotic saddle-avoidance result, gives a global asymptotic almost-sure convergence guarantee for the discrete-time gradient descent-ascent update to the set of strict local minmax equilibria. Moreover, we provide convergence rates for the gradient descent-ascent dynamics with timescale separation to approximate stationary points. [The continuous-time limiting system referenced here is written out after the table.]
Researcher Affiliation | Academia | Tanner Fiez (University of Washington, fiezt@uw.edu); Lillian J. Ratliff (University of Washington, ratliffl@uw.edu); Eric Mazumdar (California Institute of Technology, mazumdar@caltech.edu); Evan Faulkner (University of Washington, evanjf5@uw.edu); Adhyyan Narang (University of Washington, adhyyan@uw.edu)
Pseudocode | Yes | The paper provides pseudocode for both the deterministic and stochastic updates (a runnable sketch follows the table):

Algorithm 1: τ-GDA
  Input: x₀ ∈ ℝ^d₁, y₀ ∈ ℝ^d₂
  for k = 0, 1, ... do
    x_{k+1} ← x_k − γ ∇₁f(x_k, y_k)
    y_{k+1} ← y_k + γτ ∇₂f(x_k, y_k)
  end for

Algorithm 2: Stochastic τ-GDA
  Input: x₀ ∈ ℝ^d₁, y₀ ∈ ℝ^d₂
  for k = 0, 1, ... do
    x_{k+1} ← x_k − γ g₁(x_k, y_k; θ_{1,k})
    y_{k+1} ← y_k + γτ g₂(x_k, y_k; θ_{2,k})
  end for
Open Source Code | No | The paper states 'N/A' for code inclusion in the author checklist and provides no links or statements about code availability in the main text.
Open Datasets | No | The paper is theoretical and does not describe experiments involving datasets. The author checklist states 'N/A' for experimental results.
Dataset Splits | No | The paper is theoretical and does not describe experiments involving data splits. The author checklist states 'N/A' for experimental results.
Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware. The author checklist states 'N/A' for experimental results.
Software Dependencies | No | The paper is theoretical and does not describe experiments or their software dependencies. The author checklist states 'N/A' for experimental results.
Experiment Setup | No | The paper is theoretical and does not describe experimental setups, hyperparameters, or training configurations. The author checklist states 'N/A' for experimental results.
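
For context on the "τ-GDA continuous-time limiting system" referenced in the abstract above: reading Algorithm 1 as an Euler discretization and letting the stepsize γ → 0 yields the ODE below. This is a standard interpretation, sketched here for orientation; the paper's precise formulation and regularity assumptions are given in the text.

  ẋ(t) = −∇₁f(x(t), y(t))
  ẏ(t) = τ ∇₂f(x(t), y(t))

The stability characterization then says that the only locally stable points of this system correspond to strict local minmax equilibria in the nonconvex-PŁ and nonconvex-SC game classes.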
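
As a concrete illustration of Algorithms 1 and 2, below is a minimal NumPy sketch of the τ-GDA and stochastic τ-GDA updates; it is not the authors' code (the paper releases none). The toy objective f(x, y) = x⁴/4 − x²/2 + xy/2 − y²/2, the stepsize γ, the timescale τ, and the Gaussian noise model are illustrative assumptions of ours: f is nonconvex in the minimizing player x and strongly concave in the maximizing player y, placing it in the nonconvex-SC class the paper studies.

import numpy as np

# Toy objective (our illustrative choice, not from the paper):
#   f(x, y) = x**4/4 - x**2/2 + 0.5*x*y - 0.5*y**2
# Nonconvex in the minimizing player x, 1-strongly concave in the
# maximizing player y.

def grad_x(x, y):
    # Partial derivative of f with respect to x.
    return x**3 - x + 0.5 * y

def grad_y(x, y):
    # Partial derivative of f with respect to y.
    return 0.5 * x - y

def tau_gda(x0, y0, gamma=0.01, tau=10.0, iters=20_000):
    """Algorithm 1 (tau-GDA): simultaneous gradient descent on x and ascent
    on y, with the maximizing player's stepsize scaled by tau."""
    x, y = x0, y0
    for _ in range(iters):
        x, y = x - gamma * grad_x(x, y), y + gamma * tau * grad_y(x, y)
    return x, y

def stochastic_tau_gda(x0, y0, gamma=0.01, tau=10.0, iters=20_000,
                       noise=0.1, seed=0):
    """Algorithm 2 (stochastic tau-GDA): the same update driven by unbiased
    stochastic gradients, modeled here as exact gradients plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    x, y = x0, y0
    for _ in range(iters):
        g1 = grad_x(x, y) + noise * rng.standard_normal()
        g2 = grad_y(x, y) + noise * rng.standard_normal()
        x, y = x - gamma * g1, y + gamma * tau * g2
    return x, y

if __name__ == "__main__":
    # Both runs should settle near the strict local minmax point
    # (sqrt(3)/2, sqrt(3)/4) ~ (0.866, 0.433) of this toy objective;
    # the stochastic run hovers in a noise-dependent neighborhood of it.
    print(tau_gda(0.5, -0.5))
    print(stochastic_tau_gda(0.5, -0.5))

For this toy problem, a larger τ makes the maximizing player track its best response y*(x) = x/2 more tightly between x-updates, which is the timescale-separation mechanism the paper's potential-function analysis exploits.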