Learning in Games with Lossy Feedback
Authors: Zhengyuan Zhou, Panayotis Mertikopoulos, Susan Athey, Nicholas Bambos, Peter W. Glynn, Yinyu Ye
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose a simple variant of the classical online gradient descent algorithm, called reweighted online gradient descent (ROGD), and show that in variationally stable games, if each agent adopts ROGD, then almost sure convergence to the set of Nash equilibria is guaranteed, even when the feedback loss is asynchronous and arbitrarily correlated among agents. We then extend the framework to handle unknown feedback loss probabilities by substituting an estimator constructed from past data. Finally, we further extend the framework to accommodate both asynchronous loss and stochastic rewards and establish that multi-agent ROGD learning still converges to the set of Nash equilibria in such settings. |
| Researcher Affiliation | Academia | Zhengyuan Zhou (Stanford University, zyzhou@stanford.edu); Panayotis Mertikopoulos (Univ. Grenoble Alpes, CNRS, Inria, LIG, panayotis.mertikopoulos@imag.fr); Susan Athey (Stanford University, athey@stanford.edu); Nicholas Bambos (Stanford University, bambos@stanford.edu); Peter Glynn (Stanford University, glynn@stanford.edu); Yinyu Ye (Stanford University, yinyu-ye@stanford.edu) |
| Pseudocode | Yes | Algorithm 1: Multi-Agent OGD Learning; Algorithm 2: Multi-Agent OGD Learning under Asynchronous Feedback Loss; Algorithm 3: Multi-Agent ROGD Learning under Asynchronous Feedback Loss |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating the release of open-source code for the described methodology. |
| Open Datasets | No | This paper is theoretical and does not involve training on datasets; thus, there is no mention of publicly available datasets for training. |
| Dataset Splits | No | This paper is theoretical and does not involve empirical experiments with datasets, so it does not discuss training/validation/test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe experiments run on specific hardware, thus no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not detail software dependencies with version numbers needed for replication. |
| Experiment Setup | No | This is a theoretical paper that does not include an experimental setup with specific details like hyperparameters or system-level training settings. |
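The ROGD idea summarized above (reweighting received gradients so that updates remain unbiased under lossy feedback) can be sketched as follows. This is a minimal, hypothetical single-agent illustration, not the paper's multi-agent Algorithm 3: it assumes feedback arrives i.i.d. with a known probability `p_receive`, divides each received gradient by that probability, and projects onto a box-shaped feasible set. The function name `rogd` and all parameters are illustrative assumptions.

```python
import random

def rogd(grad, x0, p_receive, steps=5000, lr=0.05, lo=-10.0, hi=10.0, seed=0):
    """Sketch of reweighted online gradient descent for one agent.

    When gradient feedback arrives (with probability p_receive), the step
    divides the gradient by p_receive so the *expected* update matches the
    lossless one; when feedback is lost, the iterate stays put.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        if rng.random() < p_receive:        # feedback received this round
            x -= lr * grad(x) / p_receive   # reweight by 1/p to stay unbiased
            x = max(lo, min(hi, x))         # project back onto [lo, hi]
    return x

# Toy check: minimise f(x) = (x - 3)^2 with 60% feedback arrival.
x_star = rogd(lambda x: 2.0 * (x - 3.0), x0=9.0, p_receive=0.6)
```

With an unknown arrival probability, the paper replaces `p_receive` by an estimator built from past arrivals (e.g., the empirical arrival frequency), which this sketch does not implement.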