Bandit Learning in Concave N-Person Games
Authors: Mario Bravo, David Leslie, Panayotis Mertikopoulos
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper examines the long-run behavior of learning with bandit feedback in non-cooperative concave games... our analysis shows that no-regret learning based on mirror descent with bandit feedback converges to Nash equilibrium with probability 1. We also derive an upper bound for the convergence rate of the process that nearly matches the best attainable rate for single-agent bandit stochastic optimization. Theorem 5.1. Suppose that the players of a monotone game G ≡ G(N, X, u) follow (MD-b) with step-size γ_n and query radius δ_n such that... Then, the sequence of realized actions X̂_n converges to Nash equilibrium with probability 1. Theorem 5.2. Let x* be the (necessarily unique) Nash equilibrium of a β-strongly monotone game... we have E[‖X̂_n − x*‖²] = O(n^(−1/3)). (This rate statement is restated after the table.) |
| Researcher Affiliation | Collaboration | Mario Bravo, Universidad de Santiago de Chile, Departamento de Matemática y Ciencia de la Computación, mario.bravo.g@usach.cl; David Leslie, Lancaster University & PROWLER.io, d.leslie@lancaster.ac.uk; Panayotis Mertikopoulos, Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, 38000 Grenoble, France, panayotis.mertikopoulos@imag.fr |
| Pseudocode | Yes | Algorithm 1: Multi-agent mirror descent with bandit feedback (player indices suppressed); a hedged sketch of this update appears after the table. |
| Open Source Code | No | The paper is theoretical and does not mention releasing source code or provide links to a repository. |
| Open Datasets | No | The paper is theoretical and does not use any datasets for training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not describe dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training configurations for empirical evaluation. It specifies parameters of the proposed algorithm (step-sizes and query radii), but no experimental configuration. |
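
To make the rate result quoted in the Research Type row concrete, here is a hedged restatement of Theorem 5.2 with the step-size and query-radius schedules written out explicitly. The exact schedule exponents are my reading of the theorem's conditions (γ and δ are the paper's tunable constants), so treat this as an illustrative restatement rather than a verbatim quote.

```latex
% Hedged restatement of Theorem 5.2: in a beta-strongly monotone game with unique
% Nash equilibrium x*, running (MD-b) with the schedules below (my reading of the
% theorem's conditions) yields the O(n^{-1/3}) rate quoted in the table above.
\[
  \gamma_n = \frac{\gamma}{n},
  \qquad
  \delta_n = \frac{\delta}{n^{1/3}}
  \quad\Longrightarrow\quad
  \mathbb{E}\!\left[\lVert \hat{X}_n - x^{\ast} \rVert^{2}\right] = O\!\left(n^{-1/3}\right).
\]
```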
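The paper provides pseudocode (Algorithm 1, "Multi-agent mirror descent with bandit feedback") but no source code. The following is a minimal Python sketch of the single-point estimation idea behind that algorithm, assuming a Euclidean regularizer (so the mirror step reduces to a projection), box-shaped action sets, and illustrative 1/n and n^(−1/3) schedules; the function names, the box projection, and the handling of feasibility near the boundary are simplifications and not the paper's exact construction.

```python
import numpy as np


def sample_sphere(dim, rng):
    """Draw a perturbation direction z_i uniformly from the unit sphere in R^dim."""
    z = rng.standard_normal(dim)
    return z / np.linalg.norm(z)


def project_box(x, lo, hi):
    """Euclidean prox-map used here for illustration: projection onto the box [lo, hi]^dim."""
    return np.clip(x, lo, hi)


def mirror_descent_bandit(payoffs, dims, T=10_000, lo=-1.0, hi=1.0,
                          gamma0=1.0, delta0=0.1, seed=0):
    """Sketch of multi-agent mirror descent with bandit (payoff-only) feedback.

    payoffs : list of callables u_i(actions) -> float, one per player, where
              `actions` is the list of all players' realized actions.
    dims    : list of action-space dimensions d_i.
    """
    rng = np.random.default_rng(seed)
    num_players = len(dims)
    X = [np.zeros(d) for d in dims]  # base points X_i, one per player

    for n in range(1, T + 1):
        # Illustrative schedules: gamma_n ~ 1/n and delta_n ~ n^(-1/3).
        gamma_n = gamma0 / n
        delta_n = delta0 / n ** (1.0 / 3.0)

        # Each player perturbs their base point and plays the realized action X̂_i.
        Z = [sample_sphere(d, rng) for d in dims]
        X_hat = [project_box(X[i] + delta_n * Z[i], lo, hi) for i in range(num_players)]

        # Bandit feedback: each player observes only their own realized payoff ...
        u_hat = [payoffs[i](X_hat) for i in range(num_players)]

        # ... and forms the single-point gradient estimate (d_i / delta_n) * û_i * z_i.
        V_hat = [(dims[i] / delta_n) * u_hat[i] * Z[i] for i in range(num_players)]

        # Mirror (here: projected) ascent step on each player's own payoff.
        X = [project_box(X[i] + gamma_n * V_hat[i], lo, hi) for i in range(num_players)]

    return X
```

For instance, passing two one-dimensional quadratic payoff functions of a strongly monotone game should drive the returned base points toward the game's unique Nash equilibrium, mirroring the behavior guaranteed by Theorems 5.1 and 5.2.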