Towards Sharper Risk Bounds for Minimax Problems

Authors: Bowei Zhu, Shaojie Li, Yong Liu

IJCAI 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | For theoretical analysis, current optimal excess risk bounds, which are composed of generalization error and optimization error, present 1/n-rates in strongly-convex-strongly-concave (SC-SC) settings. Existing studies mainly focus on minimax problems with specific algorithms for the optimization error, with only a few studies on generalization performance, which limits the sharpness of attainable excess risk bounds. In this paper, we study generalization bounds measured by the gradients of primal functions using uniform localized convergence. We obtain a sharper high-probability generalization error bound for nonconvex-strongly-concave (NC-SC) stochastic minimax problems. Furthermore, we provide dimension-independent results under the Polyak-Łojasiewicz (PL) condition for the outer layer. Based on our generalization error bound, we analyze some popular algorithms such as empirical saddle point (ESP), gradient descent ascent (GDA), and stochastic gradient descent ascent (SGDA). We derive better excess primal risk bounds under further reasonable assumptions, which, to the best of our knowledge, are n times faster than existing results for minimax problems. (The excess primal risk decomposition underlying these bounds is sketched below the table.) |
| Researcher Affiliation | Academia | Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China |
| Pseudocode | Yes | Algorithm 1: Two-timescale GDA for minimax problems; Algorithm 2: Two-timescale SGDA for minimax problems (see the code sketch below the table) |
| Open Source Code | No | The paper does not contain any statements about releasing code, nor does it provide links to a code repository for the methodology described. |
| Open Datasets | No | The paper is theoretical and focuses on bounds and algorithms rather than empirical experiments with datasets; it does not mention specific training datasets or their public availability. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with data, so no training, validation, or test splits are discussed. |
| Hardware Specification | No | The paper is theoretical and does not describe any empirical experiments, hence no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not include details about specific software dependencies or version numbers required to reproduce any empirical experiments. |
| Experiment Setup | No | The paper is theoretical and does not describe any empirical experimental setup details, hyperparameters, or training configurations. |
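The abstract's split of excess risk into generalization error plus optimization error follows the standard decomposition for stochastic minimax problems. Below is a minimal LaTeX sketch under assumed notation (the symbols f, F, F_S, P, P_S, and the output ŵ are illustrative conventions, not copied from the paper):

```latex
% Population objective F(w,v) = E_z[ f(w,v;z) ]; empirical objective
% F_S(w,v) = (1/n) \sum_{i=1}^n f(w,v;z_i) over a sample S = {z_1,...,z_n}.
% Primal functions: P(w) = max_v F(w,v) and P_S(w) = max_v F_S(w,v).
% For an algorithm output \hat{w} and any minimizer w^* of P, the excess
% primal risk splits exactly as:
\begin{align*}
P(\hat{w}) - P(w^*)
  &= \underbrace{\bigl[P(\hat{w}) - P_S(\hat{w})\bigr]}_{\text{generalization}}
   + \underbrace{\bigl[P_S(\hat{w}) - P_S(w^*)\bigr]}_{\le\ \text{optimization error}}
   + \underbrace{\bigl[P_S(w^*) - P(w^*)\bigr]}_{\text{generalization}}
\end{align*}
```

The paper's stated contribution targets the generalization terms (bounded via uniform localized convergence and measured by gradients of the primal function); pairing the sharper generalization bound with known optimization-error rates for ESP, GDA, and SGDA yields the claimed faster excess primal risk bounds.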
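The pseudocode row above refers to the paper's two-timescale GDA and SGDA (Algorithms 1 and 2). Since the paper gives pseudocode only, the following Python sketch is a hypothetical reconstruction of the generic scheme; the function name, the toy objective, and the step sizes eta_w/eta_v are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def two_timescale_sgda(grad_w, grad_v, w0, v0, eta_w, eta_v, T,
                       sample=None, rng=None):
    """Generic two-timescale (stochastic) gradient descent ascent.

    With sample=None the gradients receive z=None and the loop behaves
    as deterministic GDA (Algorithm 1 style); supplying a minibatch
    sampler makes it SGDA (Algorithm 2 style). Distinct step sizes
    eta_w != eta_v give the two timescales, typically eta_w < eta_v in
    nonconvex-strongly-concave problems.
    """
    w = np.asarray(w0, dtype=float)
    v = np.asarray(v0, dtype=float)
    for _ in range(T):
        z = sample(rng) if sample is not None else None
        gw = grad_w(w, v, z)  # (stochastic) gradient for the min player
        gv = grad_v(w, v, z)  # (stochastic) gradient for the max player
        w = w - eta_w * gw    # descent step on w
        v = v + eta_v * gv    # ascent step on v
    return w, v

# Toy SC-SC example: f(w, v) = 0.5*w**2 + w*v - 0.5*v**2, saddle at (0, 0).
w, v = two_timescale_sgda(
    grad_w=lambda w, v, z: w + v,
    grad_v=lambda w, v, z: w - v,
    w0=1.0, v0=1.0, eta_w=0.05, eta_v=0.1, T=2000,
)
print(w, v)  # both iterates approach the saddle point (0, 0)
```

The two-timescale choice mirrors the NC-SC setting analyzed in the paper: the inner maximization is strongly concave and thus easier, so it is tracked with the larger step size.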