Strategizing against No-regret Learners
Authors: Yuan Deng, Jon Schneider, Balasubramanian Sivan
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study this question and show that, under some mild assumptions, the player can always guarantee himself a utility of at least what he would get in a Stackelberg equilibrium of the game. When the no-regret learner has only two actions, we show that the player cannot get any higher utility than the Stackelberg equilibrium utility. But when the no-regret learner has more than two actions and plays a mean-based no-regret strategy, we show that the player can get strictly higher than the Stackelberg equilibrium utility. We provide a characterization of the optimal game-play for the player against a mean-based no-regret learner as a solution to a control problem. When the no-regret learner's strategy also guarantees him no swap regret, we show that the player cannot get anything higher than the Stackelberg equilibrium utility. (A simulation sketch of the Stackelberg-commitment guarantee appears after the table.) |
| Researcher Affiliation | Collaboration | Yuan Deng, Duke University, ericdy@cs.duke.edu; Jon Schneider, Google Research, jschnei@google.com; Balasubramanian Sivan, Google Research, balusivan@google.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not use datasets for training. |
| Dataset Splits | No | The paper is theoretical and does not describe any validation splits or processes. |
| Hardware Specification | No | The paper is theoretical and does not describe any hardware specifications used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not describe any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details such as hyperparameters or training configurations. |
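
As a hedged illustration of the paper's baseline claim, that committing to a Stackelberg strategy guarantees the player at least the Stackelberg utility against a no-regret learner, the sketch below simulates an optimizer committing to a fixed mixed strategy against a multiplicative-weights learner (a standard mean-based no-regret algorithm). The 2x2 game, the payoff matrices `U` and `V`, and the helper `average_utility` are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical 2x2 bimatrix game (illustrative only; not from the paper).
# Rows index the optimizer's actions; columns index the learner's actions.
# U[i, j] = optimizer's payoff, V[i, j] = learner's payoff.
U = np.array([[1.0, 3.0],
              [2.0, 1.0]])
V = np.array([[2.0, 1.0],
              [1.0, 3.0]])

def average_utility(x, T=20000):
    """Optimizer commits to mixed strategy x for T rounds while the
    learner runs multiplicative weights (a mean-based no-regret
    algorithm). Returns the optimizer's average per-round utility."""
    eta = np.sqrt(np.log(V.shape[1]) / T)   # standard Hedge step size
    weights = np.ones(V.shape[1])
    total = 0.0
    for _ in range(T):
        p = weights / weights.sum()         # learner's current mixed strategy
        total += x @ U @ p                  # optimizer's expected payoff this round
        gains = x @ V                       # learner's expected payoff per action
        weights *= np.exp(eta * gains)      # multiplicative-weights update
    return total / T

# Scan over commitments; the best commitment approximates the Stackelberg
# value, since the learner's no-regret play concentrates on a best response.
best_q, best_val = max(
    ((q, average_utility(np.array([q, 1.0 - q]))) for q in np.linspace(0, 1, 21)),
    key=lambda t: t[1],
)
print(f"best commitment q = {best_q:.2f}, average utility = {best_val:.3f}")
```

Note that the learner in this sketch has only two actions, so by the paper's two-action result the optimizer should not be able to do better than the Stackelberg utility here. Reproducing the "strictly higher than Stackelberg" behavior would require a game with three or more learner actions and a time-varying play for the optimizer, which the paper characterizes as the solution to a control problem.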