Gradient-free Online Learning in Continuous Games with Delayed Rewards
Authors: Amélie Héliou, Panayotis Mertikopoulos, Zhengyuan Zhou
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this general context, we derive new bounds for the agents' regret; furthermore, under a standard diagonal concavity assumption, we show that the induced sequence of play converges to Nash equilibrium (NE) with probability 1, even if the delay between choosing an action and receiving the corresponding reward is unbounded. |
| Researcher Affiliation | Collaboration | Criteo AI Lab; Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG, 38000 Grenoble, France; Stern School of Business, NYU, and IBM Research. Correspondence to: Panayotis Mertikopoulos <panayotis.mertikopoulos@imag.fr>. |
| Pseudocode | Yes | Algorithm 1: Gradient-free Online Learning with Delayed feedback (GOLD), presented from the focal player's viewpoint; a hedged code sketch follows the table. |
| Open Source Code | No | The paper does not provide any link or other access to source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not describe experiments using datasets, so no information about public dataset availability is provided. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments, so no information about training/test/validation splits is provided. |
| Hardware Specification | No | The paper is theoretical and does not describe experiments, so no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe experiments that would require software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training settings. |
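Since the paper gives only pseudocode (Algorithm 1, GOLD) and no reference implementation, the following is a minimal, hypothetical sketch of a GOLD-style loop from a single player's viewpoint: play a perturbed pivot, receive rewards after an arbitrary delay, form one-point gradient estimates, and update the pivot. The function names (`gold_sketch`, `reward_fn`), the clipping-based projection, the FIFO delay bookkeeping, and the step-size/query-radius schedules are illustrative assumptions, not the paper's exact algorithm or tuning.

```python
import numpy as np

def sample_unit_sphere(d, rng):
    """Draw a direction uniformly from the unit sphere in R^d."""
    z = rng.standard_normal(d)
    return z / np.linalg.norm(z)

def gold_sketch(reward_fn, delays, x_init, radius, T,
                gamma=lambda t: 1.0 / np.sqrt(t),
                delta=lambda t: t ** (-0.25),
                seed=0):
    """Hypothetical single-player GOLD-style loop (not the paper's code):
    play a perturbed pivot, observe rewards only after a delay, build
    one-point gradient estimates, and update by projected gradient ascent.

    `reward_fn(t, x)` returns the scalar reward of action x at round t;
    `delays[t-1]` is how many rounds the round-t reward is held back.
    Both are placeholders standing in for the game environment.
    """
    rng = np.random.default_rng(seed)
    d = x_init.size
    pivot = x_init.astype(float)
    pending = {}  # arrival round -> list of (round, reward, direction, radius)
    for t in range(1, T + 1):
        dt = delta(t)
        z = sample_unit_sphere(d, rng)
        x = pivot + dt * z                       # perturbed query point
        r = reward_fn(t, x)                      # reward realized now...
        arrival = t + int(delays[t - 1])         # ...but observed only later
        pending.setdefault(arrival, []).append((t, r, z, dt))
        # Process whatever feedback happens to arrive at round t (possibly none).
        for (_, r_s, z_s, d_s) in pending.pop(t, []):
            v_hat = (d / d_s) * r_s * z_s        # one-point gradient estimate
            pivot = pivot + gamma(t) * v_hat     # ascent step on the reward
            pivot = np.clip(pivot, -radius, radius)  # stand-in for projection
    return pivot

# Toy usage: concave quadratic reward, fixed delay of 5 rounds.
if __name__ == "__main__":
    T = 2000
    x_star = gold_sketch(
        reward_fn=lambda t, x: -np.sum((x - 0.3) ** 2),
        delays=np.full(T, 5),
        x_init=np.zeros(2),
        radius=1.0,
        T=T,
    )
    print("approximate maximizer:", x_star)
```

In this toy run the reward is concave and the delay is constant, so the pivot should drift toward the maximizer at 0.3 in each coordinate; the paper's analysis covers the far more general delayed, multi-agent, diagonally concave setting.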