Efficient Algorithms for Adversarial Contextual Learning
Authors: Vasilis Syrgkanis, Akshay Krishnamurthy, Robert Schapire
ICML 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We provide the first oracle efficient sublinear regret algorithms for adversarial versions of the contextual bandit problem. Our algorithms fall into the Follow-The-Perturbed-Leader family (Kalai & Vempala, 2005) and achieve regret O(T 3/4p K log(N)) in the transductive setting and O(T 2/3d3/4K p log(N)) in the separator setting, where T is the number of rounds, K is the number of actions, N is the number of baseline policies, and d is the size of the separator. The proof of the theorem is based on a well-known bethe-leader argument. |
| Researcher Affiliation | Collaboration | Vasilis Syrgkanis VASY@MICROSOFT.COM Microsoft Research, 641 Avenue of the Americas, New York, NY 10011 USA Akshay Krishnamurthy AKSHAYKR@CS.CMU.EDU Microsoft Research, 641 Avenue of the Americas, New York, NY 10011 USA Robert E. Schapire SCHAPIRE@MICROSOFT.COM Microsoft Research, 641 Avenue of the Americas, New York, NY 10011 USA |
| Pseudocode | Yes | Algorithm 1 Follow the perturbed leader with fake sample perturbations FTPL. Algorithm 2 Contextual Follow the Perturbed Leader Algorithm CONTEXT-FTPL(X, ǫ). Algorithm 3 Contextual Semi-Bandit Algorithm CONTEXT-SEMI-BANDIT-FTPL(X, ǫ, L). |
| Open Source Code | No | The paper does not provide any statement about making its source code publicly available, nor does it include links to a code repository. |
| Open Datasets | No | This is a theoretical paper that does not conduct empirical experiments, therefore it does not use or make publicly available any dataset for training. |
| Dataset Splits | No | This is a theoretical paper that does not conduct empirical experiments and therefore does not specify training, validation, or test dataset splits. |
| Hardware Specification | No | This is a theoretical paper that does not describe any specific hardware used for running experiments. |
| Software Dependencies | No | This is a theoretical paper and does not specify any software dependencies with version numbers. |
| Experiment Setup | No | This is a theoretical paper and does not describe an experimental setup with specific hyperparameters or training settings. |