Contextual bandits with surrogate losses: Margin bounds and efficient algorithms

Authors: Dylan J. Foster, Akshay Krishnamurthy

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We use surrogate losses to obtain several new regret bounds and new algorithms for contextual bandit learning. Using the ramp loss, we derive new margin-based regret bounds in terms of standard sequential complexity measures of a benchmark class of real-valued regression functions. Using the hinge loss, we derive an efficient algorithm with a d√T-type mistake bound against benchmark policies induced by d-dimensional regressors. Under realizability assumptions, our results also yield classical regret bounds.
Researcher Affiliation | Collaboration | Dylan J. Foster (Cornell University, djfoster@cs.cornell.edu); Akshay Krishnamurthy (Microsoft Research NYC, akshay@cs.umass.edu)
Pseudocode | Yes | Algorithm 1 (HINGE-LMC) and Algorithm 2 (Langevin Monte Carlo, LMC)
Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or an explicit code-release statement) for open-source code related to the described methodology.
Open Datasets | No | The paper focuses on theoretical contributions (regret bounds, algorithms) and does not describe experiments that use a specific dataset for training.
Dataset Splits | No | The paper is theoretical and does not describe experiments involving dataset splits for validation.
Hardware Specification | No | The paper is theoretical and does not describe any experiments requiring specific hardware.
Software Dependencies | No | The paper is theoretical and does not describe empirical implementations that would require specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical, focusing on algorithm design and theoretical bounds; it therefore does not report experimental setup details such as hyperparameters for empirical runs.
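Algorithm 2 of the paper is a Langevin Monte Carlo routine used inside HINGE-LMC. As a rough illustration of the generic unadjusted LMC update it builds on — not the paper's algorithm, which targets a hinge-loss objective — here is a minimal sketch that samples from a standard Gaussian target. All function and parameter names here are hypothetical, and the step size and iteration counts are arbitrary illustrative choices.

```python
import math
import random


def langevin_step(x, grad_u, step, rng):
    """One unadjusted Langevin update: x - step * grad_U(x) + sqrt(2*step) * noise."""
    return x - step * grad_u(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)


def run_lmc(grad_u, x0=0.0, step=0.01, n_steps=60000, burn_in=5000, seed=0):
    """Run a 1-D unadjusted Langevin chain and return post-burn-in samples."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for t in range(n_steps):
        x = langevin_step(x, grad_u, step, rng)
        if t >= burn_in:
            samples.append(x)
    return samples


# Target density proportional to exp(-U(x)) with U(x) = x^2 / 2,
# i.e. a standard Gaussian, so grad_U(x) = x.
samples = run_lmc(grad_u=lambda x: x)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

After burn-in the empirical mean should be near 0 and the empirical variance near 1 (up to discretization bias of order the step size), which is the sense in which the chain approximately samples the target.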