Convergence of No-Swap-Regret Dynamics in Self-Play

Authors: Renato Paes Leme, Georgios Piliouras, Jon Schneider

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We prove that in almost all symmetric zero-sum games under symmetric initializations of the agents, no-swap-regret dynamics in self-play are guaranteed to converge in a strong frequent-iterate sense to the Nash equilibrium: in all but a vanishing fraction of the rounds, the players must play a strategy profile close to a symmetric Nash equilibrium. Remarkably, relaxing any of these three constraints, i.e. by allowing either i) asymmetric initial conditions, or ii) an asymmetric game or iii) no-external regret dynamics suffices to destroy this result and lead to complex non-equilibrating or even chaotic behavior.
Researcher Affiliation | Industry | Renato Paes Leme (Google Research); Georgios Piliouras (Google DeepMind); Jon Schneider (Google Research)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. The methods are described through mathematical definitions, lemmas, and theorems.
Open Source Code | No | The NeurIPS Paper Checklist states: 'The paper does not include experiments requiring code.' No explicit statement or link providing access to source code for the methodology described in the paper is found.
Open Datasets | No | The paper is theoretical and does not involve training models on datasets. Therefore, no information about publicly available or open datasets for training is provided.
Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with datasets. As such, no information about training/test/validation dataset splits is provided.
Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware. Therefore, no hardware specifications (GPU, CPU models, etc.) are mentioned.
Software Dependencies | No | The paper is theoretical and does not involve running computational experiments that would necessitate listing specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not present empirical experiments. Therefore, there are no specific experimental setup details such as hyperparameters or system-level training settings.
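
The Research Type entry above contrasts no-swap-regret dynamics with no-external-regret dynamics. The following is a minimal illustrative sketch of that distinction and is not taken from the paper, which contains no code. The function names `external_regret` and `swap_regret`, the rock-paper-scissors payoff matrix, and the cycling play sequence are all assumptions chosen for illustration. External regret compares realized payoff against the best single fixed action in hindsight, while swap regret allows a separate best replacement for each action that was played.

```python
# Illustrative sketch only (hypothetical names, not the authors' code).
# Row player's payoffs in rock-paper-scissors: 0 = rock, 1 = paper, 2 = scissors.
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def external_regret(payoff, own, opp):
    """Best fixed action in hindsight minus the realized total payoff."""
    n = len(payoff)
    realized = sum(payoff[a][b] for a, b in zip(own, opp))
    best_fixed = max(sum(payoff[j][b] for b in opp) for j in range(n))
    return best_fixed - realized


def swap_regret(payoff, own, opp):
    """Sum, over each played action i, of the best gain from replacing
    every occurrence of i with a single alternative action j."""
    n = len(payoff)
    total = 0
    for i in range(n):
        # gains[j]: extra payoff from playing j in all rounds where i was played.
        gains = [
            sum(payoff[j][b] - payoff[i][b] for a, b in zip(own, opp) if a == i)
            for j in range(n)
        ]
        total += max(gains)  # gains[i] == 0, so each term is non-negative
    return total


# Assumed toy sequence: both players cycle rock, paper, scissors in lockstep.
own = [0, 1, 2] * 100
opp = list(own)

print(external_regret(RPS, own, opp))  # 0
print(swap_regret(RPS, own, opp))      # 300
```

On this toy sequence the external regret is zero while the swap regret grows linearly with the number of rounds, so a sequence can satisfy the weaker guarantee without satisfying the stronger one. This is only a consistency check on the two definitions and does not reproduce any result from the paper.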