Design of Experiments for Stochastic Contextual Linear Bandits

Authors: Andrea Zanette, Kefan Dong, Jonathan N Lee, Emma Brunskill

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present a theoretical analysis as well as numerical experiments on both synthetic and real-world datasets." and (Section 7, Experiments) "We now study the empirical properties of the planner and sampler in a synthetic setting and a real-world dataset from the Yahoo! Learning to Rank challenge (Chapelle and Chang, 2011)."
Researcher Affiliation | Academia | Andrea Zanette, EECS Department, University of California, Berkeley, Berkeley, CA, zanette@berkeley.edu; Kefan Dong, Department of Computer Science, Stanford University, Stanford, CA, kefandong@stanford.edu; Jonathan Lee, Department of Computer Science, Stanford University, Stanford, CA, jnl@stanford.edu; Emma Brunskill, Department of Computer Science, Stanford University, Stanford, CA, ebrun@cs.stanford.edu
Pseudocode | Yes | "Algorithm 1 PLANNER (Reward-free LINUCB)" and "Algorithm 2 SAMPLER"
Open Source Code | No | No explicit statement providing access to open-source code for the methodology was found. The arXiv link is for the paper itself, not for code.
Open Datasets | Yes | "We evaluate the planner-sampler performance on real-world data from the Yahoo! Learning to Rank challenge (Chapelle and Chang, 2011)"
Dataset Splits | Yes | "The dataset is already divided into training, validation, and testing data." and "During the offline phase, we run the planner on the validation set where none of the relevance scores are observed in order to generate the sampling policies."
Hardware Specification | No | The paper provides no specific hardware details (e.g., GPU/CPU models, processor types, memory amounts, or detailed computer specifications) for the machines used to run the experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | "All algorithms used λ = 1 and the planner used α = 1 as these worked well. See appendix for more λ settings." and "The planner uses α = 1; we did not find significant improvements with α < 1. Additional values for λ are reported in the supplementary material."