Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Learning with Bandit Feedback in Potential Games
Authors: Amélie Heliou, Johanne Cohen, Panayotis Mertikopoulos
NeurIPS 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper examines the equilibrium convergence properties of no-regret learning with exponential weights in potential games. To establish convergence with minimal information requirements on the players side, we focus on two frameworks: the semi-bandit case (where players have access to a noisy estimate of their payoff vectors, including strategies they did not play), and the bandit case (where players are only able to observe their in-game, realized payoffs). In the semi-bandit case, we show that the induced sequence of play converges almost surely to a Nash equilibrium at a quasi-exponential rate. In the bandit case, the same result holds for ε-approximations of Nash equilibria if we introduce an exploration factor ε > 0 that guarantees that action choice probabilities never fall below ε. In particular, if the algorithm is run with a suitably decreasing exploration factor, the sequence of play converges to a bona fide Nash equilibrium with probability 1. |
| Researcher Affiliation | Academia | Johanne Cohen LRI-CNRS, Université Paris-Sud,Université Paris-Saclay, France EMAIL Amélie Héliou LIX, Ecole Polytechnique, CNRS, AMIBio, Inria, Université Paris-Saclay EMAIL Panayotis Mertikopoulos Univ. Grenoble Alpes, CNRS, Inria, LIG, F-38000, Grenoble, France EMAIL |
| Pseudocode | Yes | Algorithm 1 ε-HEDGE with generic feedback Algorithm 2 Exponential weights with annealing |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that source code for the methodology is openly available. |
| Open Datasets | No | The paper is theoretical and does not mention using any datasets for training. Therefore, no information about public dataset availability is provided. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments with data. Thus, there is no mention of training/validation/test dataset splits. |
| Hardware Specification | No | The paper is theoretical and focuses on mathematical proofs and algorithms. It does not describe any experimental setup or mention hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe any specific software dependencies or versions used for implementation or experimentation. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithm design and convergence analysis. It does not describe an experimental setup with specific hyperparameters or training configurations. |