Contextual bandits with surrogate losses: Margin bounds and efficient algorithms
Authors: Dylan J. Foster, Akshay Krishnamurthy
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We use surrogate losses to obtain several new regret bounds and new algorithms for contextual bandit learning. Using the ramp loss, we derive new margin-based regret bounds in terms of standard sequential complexity measures of a benchmark class of real-valued regression functions. Using the hinge loss, we derive an efficient algorithm with a $\sqrt{dT}$-type mistake bound against benchmark policies induced by d-dimensional regressors. Under realizability assumptions, our results also yield classical regret bounds. (Standard forms of these surrogate losses are sketched below the table.) |
| Researcher Affiliation | Collaboration | Dylan J. Foster (Cornell University, djfoster@cs.cornell.edu); Akshay Krishnamurthy (Microsoft Research NYC, akshay@cs.umass.edu) |
| Pseudocode | Yes | Algorithm 1 (HINGE-LMC) and Algorithm 2 (Langevin Monte Carlo, LMC); a generic LMC sketch is given below the table. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement) for open-source code related to the described methodology. |
| Open Datasets | No | The paper focuses on theoretical contributions (regret bounds, algorithms) and does not describe experiments that use a specific dataset for training. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments that involve dataset splits for validation. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments requiring specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe empirical implementations that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithm design and theoretical bounds, thus it does not provide specific experimental setup details such as hyperparameters for empirical runs. |
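
The abstract excerpt in the Research Type row refers to ramp-loss and hinge-loss surrogates. As a point of reference only, the standard margin-parameterized forms of these losses are written out below; the paper's exact scaling and notation may differ.

```latex
% Standard margin surrogates at margin parameter \gamma > 0
% (a sketch of the usual definitions; the paper's parameterization may differ).
\phi^{\mathrm{hinge}}_{\gamma}(z) = \max\!\Big(0,\, 1 - \tfrac{z}{\gamma}\Big),
\qquad
\phi^{\mathrm{ramp}}_{\gamma}(z) = \min\!\Big(1,\, \max\!\Big(0,\, 1 - \tfrac{z}{\gamma}\Big)\Big).
```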
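
The Pseudocode row names Langevin Monte Carlo (LMC) as the sampling subroutine of HINGE-LMC. Below is a minimal, generic sketch of unadjusted LMC in Python; the objective, step size, and iteration count are placeholders and are not taken from the paper.

```python
import numpy as np

def langevin_monte_carlo(grad_f, w0, step_size=1e-3, n_steps=1000, rng=None):
    """Generic (unadjusted) Langevin Monte Carlo sketch.

    Iterates w <- w - eta * grad_f(w) + sqrt(2 * eta) * xi with xi ~ N(0, I),
    which approximately samples from the density proportional to exp(-f(w)).
    HINGE-LMC in the paper runs LMC on a specific hinge-loss-based objective;
    that objective and its tuning are NOT reproduced here.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = np.array(w0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(w.shape)
        w = w - step_size * grad_f(w) + np.sqrt(2.0 * step_size) * noise
    return w

# Usage: approximately sample from a standard Gaussian, where f(w) = ||w||^2 / 2.
sample = langevin_monte_carlo(grad_f=lambda w: w, w0=np.zeros(5))
```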