No-Regret Learning in Time-Varying Zero-Sum Games
Authors: Mengxiao Zhang, Peng Zhao, Haipeng Luo, Zhi-Hua Zhou
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results further validate the effectiveness of our algorithm. We also conduct empirical studies to further support our theoretical findings. |
| Researcher Affiliation | Academia | ¹University of Southern California; ²National Key Laboratory for Novel Software Technology, Nanjing University. |
| Pseudocode | Yes | Algorithm 1: Algorithm for the x-player |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | We construct an environment such that $P_T = \Theta(T)$, $W_T = \Theta(T^{3/4})$, and $V_T = \Theta(T)$. |
| Dataset Splits | No | The paper describes a simulated environment and does not mention explicit training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions implementing the algorithm but does not specify software names with version numbers for reproducibility. |
| Experiment Setup | Yes | We set the size of game matrix to be $m \times n$ with $m = 2$ and $n = 2$. The total time horizon is set as $T = 2 \times 10^6$. [...] We implement Algorithm 1 for x-player and Algorithm 2 for y-player with $L = 4$ and step size pool $\eta_i = 2^{i-1}/T$ for both players. The number of base-learners (i.e., the size of step size pool) is $N = \frac{1}{2}\log_2 T + 1 = 11$. |
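
The base-learner count quoted in the Experiment Setup row can be checked with a short calculation. The following is a minimal sketch and not the authors' code: the function names `num_base_learners` and `step_size_pool` are hypothetical, rounding the half-logarithm down is an assumption (it is what makes the count come out to 11 for the stated horizon), and the smallest step size `eta_min` is left as a parameter, with `1 / T` used below as one assumed reading of the step size formula.

```python
import math


def num_base_learners(T: int) -> int:
    # Assumption: the fractional part of (1/2) * log2(T) is rounded down.
    return math.floor(0.5 * math.log2(T)) + 1


def step_size_pool(T: int, eta_min: float) -> list:
    # Geometric grid: eta_i = eta_min * 2**(i - 1) for i = 1, ..., N.
    n = num_base_learners(T)
    return [eta_min * 2 ** (i - 1) for i in range(1, n + 1)]


T = 2 * 10**6                                # time horizon from the setup row
N = num_base_learners(T)                     # size of the step size pool
pool = step_size_pool(T, eta_min=1.0 / T)    # eta_min = 1/T is an assumed reading
print(N, pool[0], pool[-1])
```

Each step size in the pool is twice the previous one, so the grid spans a factor of 2^(N-1) between its smallest and largest entries.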