No-Regret Learning in Time-Varying Zero-Sum Games

Authors: Mengxiao Zhang, Peng Zhao, Haipeng Luo, Zhi-Hua Zhou

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results further validate the effectiveness of our algorithm. We also conduct empirical studies to further support our theoretical findings.
Researcher Affiliation | Academia | 1 University of Southern California; 2 National Key Laboratory for Novel Software Technology, Nanjing University.
Pseudocode | Yes | Algorithm 1 Algorithm for the x-player
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | No | We construct an environment such that P_T = Θ(T), W_T = Θ(T^(3/4)), and V_T = Θ(T).
Dataset Splits | No | The paper describes a simulated environment and does not mention explicit training, validation, or test dataset splits.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper mentions implementing the algorithm but does not specify software names with version numbers for reproducibility.
Experiment Setup | Yes | We set the size of game matrix to be m × n with m = 2 and n = 2. The total time horizon is set as T = 2 × 10^6. [...] We implement Algorithm 1 for x-player and Algorithm 2 for y-player with L = 4 and step size pool η_i = 2^(i-1)/T for both players. The number of base-learners (i.e., the size of step size pool) is N = (1/2) log_2 T + 1 = 11.
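The arithmetic in the Experiment Setup row can be checked with a short sketch. It is an illustration under stated assumptions, not the authors' code. The function name `step_size_pool` and the `denom` parameter are hypothetical. The count is rounded down, which is an assumption chosen because it reproduces the stated N = 11 for T = 2 × 10^6 (the unrounded value is about 11.47). The denominator is taken as T exactly as the row reads; if the paper normalizes the step sizes differently, pass another value for `denom`.

```python
import math


def step_size_pool(T, denom=None):
    """Return a geometric step size pool eta_i = 2**(i-1) / denom, i = 1..N.

    Assumption: N = floor(0.5 * log2(T)) + 1, rounded down so that
    T = 2 * 10**6 yields the N = 11 reported in the setup row.
    Assumption: denom defaults to T, following the row as printed.
    """
    N = math.floor(0.5 * math.log2(T)) + 1
    if denom is None:
        denom = T
    return [2 ** (i - 1) / denom for i in range(1, N + 1)]


T = 2 * 10**6          # total time horizon from the setup row
pool = step_size_pool(T)
print(len(pool))       # number of base-learners
print(pool[0], pool[-1])  # smallest and largest step size in the pool
```

Each base-learner would run with one entry of `pool`, so consecutive learners differ by a factor of two in step size.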