Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

Authors: Dylan J. Foster, Akshay Krishnamurthy

NeurIPS 2021

Reproducibility assessment: each entry below gives the variable, the extracted result, and the supporting LLM response.
Research Type: Experimental. LLM response: "In a large-scale empirical evaluation, we find that our approach typically outperforms comparable non-first-order methods. In Section 5, we evaluate FastCB on the large-scale contextual bandit benchmark of Bietti et al. [13] and find that it typically outperforms SquareCB and other non-adaptive baselines [35]."
Researcher Affiliation: Industry. LLM response: "Dylan J. Foster, Microsoft Research, New England (dylanfoster@microsoft.com); Akshay Krishnamurthy, Microsoft Research, NYC (akshaykr@microsoft.com)."
Pseudocode: Yes. LLM response: "Algorithm 1 FastCB (Fast Rates for Contextual Bandits)." A hedged sketch of the algorithm's action-selection step follows below.
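The core of FastCB is a probability-selection rule that shifts mass away from the oracle's greedy action in inverse proportion to predicted loss gaps. The sketch below reconstructs that rule from the paper's description, assuming the weighting p(a) = ŷ(b) / (A·ŷ(b) + γ·(ŷ(a) − ŷ(b))) for non-greedy actions a, with the leftover mass placed on the greedy action b; it is illustrative, not the authors' reference implementation.

```python
import numpy as np

def fastcb_distribution(y_hat, gamma):
    """Sketch of a FastCB-style action distribution (cf. Algorithm 1).

    Assumes loss predictions y_hat in (0, 1] from a log-loss online
    regression oracle. The weighting is reconstructed from the paper's
    description and should be treated as illustrative.
    """
    y_hat = np.asarray(y_hat, dtype=float)
    A = len(y_hat)
    b = int(np.argmin(y_hat))  # greedy action under the oracle
    p = y_hat[b] / (A * y_hat[b] + gamma * (y_hat - y_hat[b]))
    p[b] = 0.0
    p[b] = 1.0 - p.sum()       # remaining mass goes to the greedy action
    return p

# Example: four actions, exploration parameter gamma_t = 100.
probs = fastcb_distribution([0.05, 0.2, 0.5, 0.9], gamma=100.0)
action = np.random.default_rng(0).choice(len(probs), p=probs)
```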
Open Source Code: No. LLM response: "No explicit statement or direct link to the source code for the methodology described in this paper was found. The authors state that they implemented FastCB in Vowpal Wabbit, but do not provide a link to their specific implementation or confirm its open-source release as part of the VW project."
Open Datasets: Yes. LLM response: "The contextual bandit bake-off is a collection of over 500 multiclass, multilabel, and cost-sensitive classification datasets available on the openml.org platform [64]."
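Because the benchmark lives on openml.org, individual datasets can be fetched programmatically. A minimal sketch using scikit-learn's fetch_openml; the dataset ID below is illustrative, not one singled out by the paper.

```python
from sklearn.datasets import fetch_openml

# Illustrative only: data_id=61 is OpenML's classic "iris" dataset, not a
# dataset named by the paper; the bake-off spans 500+ OpenML datasets.
ds = fetch_openml(data_id=61, as_frame=True)
X, y = ds.data, ds.target
print(X.shape, y.nunique())
```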
Dataset Splits: No. LLM response: "No explicit information about training, validation, or test dataset splits (e.g., specific percentages or sample counts for partitioning the data) was found. The paper uses progressive validation as its evaluation method, which is a running loss computed as the learner interacts with the data stream rather than a static data split." (A sketch of progressive validation follows below.)
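Progressive validation scores each example with the model's prediction before training on it, so the running average is an unbiased online estimate of held-out loss. A minimal sketch, where predict and update are placeholder method names standing in for any online learner:

```python
def progressive_validation_loss(learner, stream):
    """Running progressive-validation 0/1 loss over an example stream.

    `learner` is any online model exposing predict(x) and update(x, y);
    these method names are placeholders, not a specific library's API.
    """
    total, n = 0.0, 0
    for x, y in stream:
        y_pred = learner.predict(x)   # predict BEFORE seeing the label
        total += float(y_pred != y)   # 0/1 loss; swap in any loss here
        learner.update(x, y)          # only then train on the example
        n += 1
    return total / max(n, 1)
```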
Hardware Specification: No. LLM response: "No specific hardware details (e.g., exact GPU/CPU models, memory) used for running the experiments were provided. The paper only vaguely acknowledges Google Cloud credits used for experiments."
Software Dependencies: No. LLM response: "No specific version numbers for key software components were provided. The paper mentions using the Vowpal Wabbit (VW) online learning library but does not specify its version." (An illustrative VW invocation follows below.)
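VW exposes SquareCB-style exploration on its command line; whether the authors' FastCB variant shipped under the same reduction is not stated in the paper, so the sketch below covers only the SquareCB baseline with the γ-schedule flags, invoked from Python. The path data.vw is a placeholder.

```python
import subprocess

# Assumes a local `vw` binary and a dataset in VW's ADF contextual-bandit
# format. In VW, --gamma_scale and --gamma_exponent parameterize the
# schedule gamma_t = gamma_scale * t^gamma_exponent for SquareCB-style
# exploration; no FastCB-specific flag is documented in the paper.
subprocess.run(
    ["vw", "--cb_explore_adf", "--squarecb",
     "--gamma_scale", "100", "--gamma_exponent", "0.5",
     "-d", "data.vw"],
    check=True,
)
```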
Experiment Setup: Yes. LLM response: "Following Foster et al. [35], we set γ_t = γ_0 · t^ρ, where γ_0 ∈ {10, 50, 100, 400, 700, 10^3} and ρ ∈ {0.25, 0.5} are hyperparameters."
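The schedule and its grid are straightforward to enumerate; a minimal sketch of the twelve-configuration sweep, with helper names of our own choosing:

```python
from itertools import product

GAMMA_0 = [10, 50, 100, 400, 700, 1000]  # gamma_0 grid from the paper
RHO = [0.25, 0.5]                        # rho grid from the paper

def gamma_schedule(gamma_0, rho, T):
    """gamma_t = gamma_0 * t^rho for rounds t = 1, ..., T."""
    return [gamma_0 * t ** rho for t in range(1, T + 1)]

# Enumerate all 6 x 2 = 12 hyperparameter configurations.
for gamma_0, rho in product(GAMMA_0, RHO):
    print(gamma_0, rho, [round(g, 1) for g in gamma_schedule(gamma_0, rho, 5)])
```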