Learning in Games: Robustness of Fast Convergence

Authors: Dylan J. Foster, Zhiyuan Li, Thodoris Lykouris, Karthik Sridharan, Éva Tardos

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility

Variable | Result | LLM Response
Research Type | Theoretical | We show that learning algorithms satisfying a low approximate regret property experience fast convergence to approximate optimality in a large class of repeated games. Our property, which simply requires that each learner has small regret compared to a (1 + ε)-multiplicative approximation to the best action in hindsight, is ubiquitous among learning algorithms; it is satisfied even by the vanilla Hedge forecaster. Our results improve upon recent work of Syrgkanis et al. [28] in a number of ways. We require only that players observe payoffs under other players' realized actions, as opposed to expected payoffs. We further show that convergence occurs with high probability, and show convergence under bandit feedback. Finally, we improve upon the speed of convergence by a factor of n, the number of players. (A paraphrase of the low approximate regret property is sketched after this table.)
Researcher Affiliation | Academia | Cornell University ({djfoster,teddlyk,sridharan,eva}@cs.cornell.edu); Tsinghua University (lizhiyuan13@mails.tsinghua.edu.cn).
Pseudocode | Yes | Algorithm 3: Initialize w^1 to the uniform distribution. On each round t, perform the update w^t_{s^{t-1}} = w^{t-1}_{s^{t-1}} + c^t_{s^{t-1}} + γ·w^{t-1}_{s^{t-1}}, and for all j ≠ s^{t-1}, w^t_j = w^{t-1}_j + γ·w^{t-1}_j, where γ ≥ 0 is chosen so that w^t is a valid probability distribution. (A runnable sketch of this update follows after the table.)
Open Source Code | No | The paper does not provide any links to open-source code or state that its code will be made publicly available; the acknowledgements only mention another researcher sharing his simulation software.
Open Datasets | No | The paper is theoretical and does not describe any experiments that would involve training a model on a dataset. Therefore, no information about publicly available datasets is provided.
Dataset Splits | No | The paper is theoretical and does not conduct experiments; therefore, no dataset splits for training, validation, or testing are mentioned.
Hardware Specification | No | The paper is theoretical and does not describe any computational experiments that would require specific hardware; therefore, no hardware specifications are provided.
Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers for experimental setup or reproduction.
Experiment Setup | No | The paper focuses on theoretical analysis and algorithm design, not experimental implementation; therefore, no experimental setup details such as hyperparameters or training configurations are provided.
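For reference, the low approximate regret property described in the abstract can be paraphrased as follows. This is a hedged sketch, not the paper's exact definition: the notation p_t (the learner's distribution at round t), ℓ_t (the loss vector), N (the number of actions), and the additive term A(ε) are introduced here for illustration, and the exact constants and admissible range of ε follow the paper.

```latex
% Paraphrase of the low approximate regret property from the abstract:
% the learner's cumulative loss is within a (1 + \epsilon) multiplicative
% factor of the best fixed action in hindsight, up to an additive term
% A(\epsilon) bounded in the paper (for Hedge it scales with \log N).
\sum_{t=1}^{T} \langle p_t, \ell_t \rangle
  \;\le\; (1 + \epsilon)\, \min_{i \in [N]} \sum_{t=1}^{T} \ell_{t,i}
  \;+\; A(\epsilon),
\qquad \text{for all admissible } \epsilon > 0 .
```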
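The Algorithm 3 update quoted in the Pseudocode row can also be sketched in code. This is a minimal sketch of the reconstructed update under stated assumptions, not the paper's exact algorithm: the function name `algorithm3_update`, the feedback value c, the sign convention (cost vs. payoff), and the driver loop are all hypothetical, and γ is simply solved from the normalization constraint rather than assumed nonnegative.

```python
import numpy as np


def algorithm3_update(w_prev: np.ndarray, played: int, c: float) -> np.ndarray:
    """One round of the reconstructed Algorithm 3 update (a sketch only).

    Only the played action's weight receives the feedback term c; every
    weight is then scaled by (1 + gamma), with gamma solved from the
    requirement that the new weights again sum to 1. Whether gamma is
    nonnegative depends on the sign convention for c, which this sketch
    does not fix.
    """
    # Normalization: (1 + gamma) * sum(w_prev) + c = 1 with sum(w_prev) = 1,
    # hence gamma = -c.
    gamma = -c
    w = w_prev * (1.0 + gamma)   # w^t_j = w^{t-1}_j + gamma * w^{t-1}_j
    w[played] += c               # extra feedback term for the realized action
    return w


# Hypothetical usage: run the update for a few rounds with placeholder feedback.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_actions = 4
    w = np.full(n_actions, 1.0 / n_actions)    # w^1 uniform, as in the row above
    for t in range(10):
        played = rng.choice(n_actions, p=w)    # sample the realized action from w^t
        c = rng.uniform(0.0, 0.1)              # placeholder feedback value
        w = algorithm3_update(w, played, c)
        assert abs(w.sum() - 1.0) < 1e-9 and (w >= 0).all()
    print(w)
```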