Learning in Games with Lossy Feedback
Authors: Zhengyuan Zhou, Panayotis Mertikopoulos, Susan Athey, Nicholas Bambos, Peter W. Glynn, Yinyu Ye
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose a simple variant of the classical online gradient descent algorithm, called reweighted online gradient descent (ROGD), and show that in variationally stable games, if each agent adopts ROGD, then almost sure convergence to the set of Nash equilibria is guaranteed, even when the feedback loss is asynchronous and arbitrarily correlated among agents. We then extend the framework to handle unknown feedback loss probabilities by substituting an estimator constructed from past data. Finally, we further extend the framework to accommodate both asynchronous loss and stochastic rewards and establish that multi-agent ROGD learning still converges to the set of Nash equilibria in such settings. |
| Researcher Affiliation | Academia | Zhengyuan Zhou (Stanford University, zyzhou@stanford.edu); Panayotis Mertikopoulos (Univ. Grenoble Alpes, CNRS, Inria, LIG, panayotis.mertikopoulos@imag.fr); Susan Athey (Stanford University, athey@stanford.edu); Nicholas Bambos (Stanford University, bambos@stanford.edu); Peter Glynn (Stanford University, glynn@stanford.edu); Yinyu Ye (Stanford University, yinyu-ye@stanford.edu) |
| Pseudocode | Yes | Algorithm 1: Multi-Agent OGD Learning; Algorithm 2: Multi-Agent OGD Learning under Asynchronous Feedback Loss; Algorithm 3: Multi-Agent ROGD Learning under Asynchronous Feedback Loss |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating the release of open-source code for the described methodology. |
| Open Datasets | No | This paper is theoretical and does not involve training on datasets; thus, there is no mention of publicly available datasets for training. |
| Dataset Splits | No | This paper is theoretical and does not involve empirical experiments with datasets, so it does not discuss training/validation/test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe experiments run on specific hardware, thus no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not detail software dependencies with version numbers needed for replication. |
| Experiment Setup | No | This is a theoretical paper that does not include an experimental setup with specific details like hyperparameters or system-level training settings. |
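The ROGD idea summarized above (reweighting received gradients so that updates remain unbiased under lossy feedback) can be sketched as follows. This is a minimal, hypothetical single-agent illustration, not the paper's multi-agent Algorithm 3: it assumes feedback arrives i.i.d. with a known probability `p_receive`, divides each received gradient by that probability, and projects onto a box-shaped feasible set. The function name `rogd` and all parameters are illustrative assumptions.

```python
import random

def rogd(grad, x0, p_receive, steps=5000, lr=0.05, lo=-10.0, hi=10.0, seed=0):
    """Sketch of reweighted online gradient descent for one agent.

    When gradient feedback arrives (with probability p_receive), the step
    divides the gradient by p_receive so the *expected* update matches the
    lossless one; when feedback is lost, the iterate stays put.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        if rng.random() < p_receive:        # feedback received this round
            x -= lr * grad(x) / p_receive   # reweight by 1/p to stay unbiased
            x = max(lo, min(hi, x))         # project back onto [lo, hi]
    return x

# Toy check: minimise f(x) = (x - 3)^2 with 60% feedback arrival.
x_star = rogd(lambda x: 2.0 * (x - 3.0), x0=9.0, p_receive=0.6)
```

With an unknown arrival probability, the paper replaces `p_receive` by an estimator built from past arrivals (e.g., the empirical arrival frequency), which this sketch does not implement.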