Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games
Authors: Tanner Fiez, Lillian Ratliff, Eric Mazumdar, Evan Faulkner, Adhyyan Narang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study gradient descent-ascent learning dynamics with timescale separation (τ-GDA) in unconstrained continuous action zero-sum games where the minimizing player faces a nonconvex optimization problem and the maximizing player optimizes a Polyak-Łojasiewicz (PŁ) or strongly-concave (SC) objective. In contrast to past work on gradient-based learning in nonconvex-PŁ/SC zero-sum games, we assess convergence in relation to natural game-theoretic equilibria instead of only notions of stationarity. In pursuit of this goal, we prove that the only locally stable points of the τ-GDA continuous-time limiting system correspond to strict local minmax equilibria in each class of games. For these classes of games, we exploit timescale separation to construct a potential function that, when combined with the stability characterization and an asymptotic saddle avoidance result, gives a global asymptotic almost-sure convergence guarantee for the discrete-time gradient descent-ascent update to the set of strict local minmax equilibria. Moreover, we provide convergence rates for the gradient descent-ascent dynamics with timescale separation to approximate stationary points. |
| Researcher Affiliation | Academia | Tanner Fiez (University of Washington); Lillian J. Ratliff (University of Washington); Eric Mazumdar (California Institute of Technology); Evan Faulkner (University of Washington); Adhyyan Narang (University of Washington) |
| Pseudocode | Yes | Algorithm 1 (τ-GDA). Input: $x_0 \in \mathbb{R}^{d_1}$, $y_0 \in \mathbb{R}^{d_2}$. For $k = 0, 1, \dots$: $x_{k+1} = x_k - \gamma \nabla_1 f(x_k, y_k)$, $y_{k+1} = y_k + \gamma\tau \nabla_2 f(x_k, y_k)$. Algorithm 2 (Stochastic τ-GDA). Input: $x_0 \in \mathbb{R}^{d_1}$, $y_0 \in \mathbb{R}^{d_2}$. For $k = 0, 1, \dots$: $x_{k+1} = x_k - \gamma g_1(x_k, y_k; \theta_{1,k})$, $y_{k+1} = y_k + \gamma\tau g_2(x_k, y_k; \theta_{2,k})$. (A runnable sketch of both updates appears after the table.) |
| Open Source Code | No | The paper states 'N/A' for including code in the author checklist and provides no explicit links or statements about code availability for their work in the main text. |
| Open Datasets | No | The paper is theoretical and does not describe experiments involving datasets. The author checklist states 'N/A' for experimental results. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments involving data splits. The author checklist states 'N/A' for experimental results. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware specifications. The author checklist states 'N/A' for experimental results. |
| Software Dependencies | No | The paper is theoretical and does not describe experiments or their software dependencies. The author checklist states 'N/A' for experimental results. |
| Experiment Setup | No | The paper is theoretical and does not describe experimental setups, hyperparameters, or training configurations. The author checklist states 'N/A' for experimental results. |
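The two pseudocode listings translate directly into code. Below is a minimal NumPy sketch of both updates, not taken from the paper: the toy nonconvex-strongly-concave objective $f(x, y) = \tfrac{1}{2}x^4 - x^2 + xy - \tfrac{1}{2}y^2$, the function names, the Gaussian noise model for the stochastic oracles, and the hyperparameter values ($\gamma$, $\tau$, step counts) are all illustrative assumptions; only the update rules themselves come from Algorithms 1 and 2 above.

```python
import numpy as np

# Hypothetical objective, nonconvex in x and strongly concave in y
# (d2f/dy2 = -1 < 0 everywhere; d2f/dx2 = 6x^2 - 2 is negative near 0):
#   f(x, y) = 0.5*x**4 - x**2 + x*y - 0.5*y**2

def grad_x(x, y):
    # Partial derivative of f with respect to the minimizing player x.
    return 2.0 * x**3 - 2.0 * x + y

def grad_y(x, y):
    # Partial derivative of f with respect to the maximizing player y.
    return x - y

def tau_gda(x0, y0, gamma=0.01, tau=10.0, steps=5000):
    """Algorithm 1: deterministic tau-GDA.

    Descent step on x, timescale-separated ascent step on y:
        x_{k+1} = x_k - gamma * grad_1 f(x_k, y_k)
        y_{k+1} = y_k + gamma * tau * grad_2 f(x_k, y_k)
    """
    x, y = x0, y0
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x = x - gamma * gx
        y = y + gamma * tau * gy
    return x, y

def stochastic_tau_gda(x0, y0, gamma=0.01, tau=10.0, steps=5000,
                       noise=0.1, seed=0):
    """Algorithm 2: stochastic tau-GDA with unbiased noisy gradient
    oracles g1, g2 standing in for grad_1 f and grad_2 f."""
    rng = np.random.default_rng(seed)
    x, y = x0, y0
    for _ in range(steps):
        g1 = grad_x(x, y) + noise * rng.standard_normal()
        g2 = grad_y(x, y) + noise * rng.standard_normal()
        x = x - gamma * g1
        y = y + gamma * tau * g2
    return x, y

if __name__ == "__main__":
    # For this toy f, the best response is y*(x) = x, and the induced
    # function f(x, x) has strict local minima at x = ±1/sqrt(2), so a run
    # from (0.5, -0.5) should approach the strict local minmax point
    # (1/sqrt(2), 1/sqrt(2)) ≈ (0.707, 0.707).
    print(tau_gda(x0=0.5, y0=-0.5))
    print(stochastic_tau_gda(x0=0.5, y0=-0.5))
```

The timescale separation is visible in the asymmetric step sizes: the maximizing player moves with effective step $\gamma\tau$ while the minimizing player moves with step $\gamma$, which is what the paper exploits to relate the stable points of the dynamics to strict local minmax equilibria.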