Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the Convergence of No-Regret Learning Dynamics in Time-Varying Games
Authors: Ioannis Anagnostides, Ioannis Panageas, Gabriele Farina, Tuomas Sandholm
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, although the focus of this paper is theoretical, in this section we provide some illustrative experimental examples. In particular, Appendix B.1 contains experiments on time-varying potential games, while Appendix B.2 focuses on time-varying (two-player) zero-sum games. |
| Researcher Affiliation | Collaboration | Ioannis Anagnostides (Carnegie Mellon University); Ioannis Panageas (University of California, Irvine); Gabriele Farina (MIT); Tuomas Sandholm (Carnegie Mellon University; Strategic Machine, Inc.; Strategy Robot, Inc.; Optimized Markets, Inc.) |
| Pseudocode | No | No |
| Open Source Code | No | No |
| Open Datasets | No | No |
| Dataset Splits | No | No |
| Hardware Specification | No | No |
| Software Dependencies | No | No |
| Experiment Setup | Yes | In our first experiment, we first sampled two matrices A, P ∈ ℝ^{d_x × d_y}, where d_x = d_y = 1000. Then, we defined each payoff matrix as A^(t) := A^(t−1) + P/t^α for t ≥ 1, where A^(0) := A. Here, α > 0 is a parameter that controls the variation of the payoff matrices. In this time-varying setup, we let each player employ (online) GD with learning rate η := 0.1. |
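The setup described in the row above can be sketched in code. This is a minimal illustrative reconstruction, not the authors' implementation: it assumes square matrices with a much smaller dimension than the paper's 1000 (for speed), a standard Gaussian initialization for A and P, and simplex-projected online gradient descent for both players of the time-varying zero-sum game. The projection helper, the horizon `T`, and the seed are choices made here for the sketch.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]
    cssv = np.cumsum(u) - 1.0
    ind = np.arange(1, v.size + 1)
    rho = ind[u - cssv / ind > 0][-1]
    theta = cssv[rho - 1] / rho
    return np.maximum(v - theta, 0.0)

rng = np.random.default_rng(0)
d, T, eta, alpha = 50, 200, 0.1, 1.0   # d is reduced from the paper's 1000

A = rng.standard_normal((d, d))        # A^(0)
P = rng.standard_normal((d, d))        # direction of payoff variation
x = np.full(d, 1.0 / d)                # uniform initial strategies
y = np.full(d, 1.0 / d)

for t in range(1, T + 1):
    A = A + P / t**alpha               # A^(t) := A^(t-1) + P / t^alpha
    # Zero-sum game: x minimizes x^T A y, y maximizes it.
    x = project_simplex(x - eta * (A @ y))
    y = project_simplex(y + eta * (A.T @ x))
```

Note that as α grows, the per-round change `P / t**alpha` decays faster, so the game becomes nearly static; this is the sense in which α controls the variation of the payoff matrices.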