reproducibilityindex.ai

Efficient Algorithms for Adversarial Contextual Learning

Authors: Vasilis Syrgkanis, Akshay Krishnamurthy, Robert Schapire

ICML 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We provide the first oracle efficient sublinear regret algorithms for adversarial versions of the contextual bandit problem. Our algorithms fall into the Follow-The-Perturbed-Leader family (Kalai & Vempala, 2005) and achieve regret O(T 3/4p K log(N)) in the transductive setting and O(T 2/3d3/4K p log(N)) in the separator setting, where T is the number of rounds, K is the number of actions, N is the number of baseline policies, and d is the size of the separator. The proof of the theorem is based on a well-known bethe-leader argument.
Researcher Affiliation	Collaboration	Vasilis Syrgkanis VASY@MICROSOFT.COM Microsoft Research, 641 Avenue of the Americas, New York, NY 10011 USA Akshay Krishnamurthy AKSHAYKR@CS.CMU.EDU Microsoft Research, 641 Avenue of the Americas, New York, NY 10011 USA Robert E. Schapire SCHAPIRE@MICROSOFT.COM Microsoft Research, 641 Avenue of the Americas, New York, NY 10011 USA
Pseudocode	Yes	Algorithm 1 Follow the perturbed leader with fake sample perturbations FTPL. Algorithm 2 Contextual Follow the Perturbed Leader Algorithm CONTEXT-FTPL(X, ǫ). Algorithm 3 Contextual Semi-Bandit Algorithm CONTEXT-SEMI-BANDIT-FTPL(X, ǫ, L).
Open Source Code	No	The paper does not provide any statement about making its source code publicly available, nor does it include links to a code repository.
Open Datasets	No	This is a theoretical paper that does not conduct empirical experiments, therefore it does not use or make publicly available any dataset for training.
Dataset Splits	No	This is a theoretical paper that does not conduct empirical experiments and therefore does not specify training, validation, or test dataset splits.
Hardware Specification	No	This is a theoretical paper that does not describe any specific hardware used for running experiments.
Software Dependencies	No	This is a theoretical paper and does not specify any software dependencies with version numbers.
Experiment Setup	No	This is a theoretical paper and does not describe an experimental setup with specific hyperparameters or training settings.