Improving offline evaluation of contextual bandit algorithms via bootstrapping techniques

Authors: Jérémie Mary, Philippe Preux, Olivier Nicol

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide both theoretical and experimental proofs of its superiority compared to state-of-the-art methods, as well as an analysis of the convergence of the measure of quality."
Researcher Affiliation | Academia | "University of Lille / LIFL (CNRS) & INRIA Lille Nord Europe, 59650 Villeneuve d'Ascq, France"
Pseudocode | Yes | "The complete procedure, called Bootstrapped Replay on Expanded Data (BRED), is implemented in algorithm 1." (A hedged sketch of this procedure follows the table.)
Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor a link to a code repository.
Open Datasets | Yes | "Note that the experiments are run on synthetic data, for reasons that we will detail and also on a large publicly available dataset." (Introduction); "on a publicly available dataset made from Yahoo! server logs and on synthetic data presenting the time acceleration issue." (Conclusion); "Yahoo! R6B dataset (Yahoo! Research, 2012)" (Section 3).
Dataset Splits | No | The paper describes using synthetic data and portions of the Yahoo! R6B dataset, and how "ground truth" was estimated for comparison. However, it does not provide specific train/validation/test splits (e.g., percentages or sample counts) or a methodology for creating a separate validation set for model development or hyperparameter tuning.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments, only general statements about data and algorithms.
Software Dependencies | No | The paper names the algorithms it uses but does not list software dependencies or version numbers of any libraries, frameworks, or programming languages used for the implementation.
Experiment Setup | Yes | "Figure 2 displays the results and interpretation of an experiment which consists in evaluating LinUCB(α = 1) using the different methods. ... Empirically, a good choice for the level of jitter seems to be a function in O(1/√T), with T the size of the dataset. Note that this is proportional to the standard deviation of the posterior of the data. The results confirm our intuition: jittering is very important when the dataset is small but gets less and less necessary as the dataset grows. (h = 50/√T here)" (See the jitter computation below the table.)
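
The Pseudocode and Experiment Setup rows above describe BRED in enough detail to sketch it. Below is a minimal Python sketch, under our reading of the paper's Algorithm 1: the logging policy is assumed to pick among K arms uniformly at random (as in the Yahoo! R6B logs), each bootstrap replicate is drawn with replacement and expanded to K·T events so that the replay estimator of Li et al. (2011) accepts roughly T of them, Gaussian jitter of level h is added to the contexts, and the per-replicate estimates are averaged. All names here (replay, bred, policy_factory, the choose/update policy interface) are hypothetical, not the authors' code — as the Open Source Code row notes, none was released.

```python
import numpy as np

def replay(policy, events):
    """Replay estimator (Li et al., 2011): keep only the events where the
    evaluated policy picks the same arm the uniform logging policy recorded."""
    matched, payoff = 0, 0.0
    for x, logged_arm, reward in events:
        if policy.choose(x) == logged_arm:
            policy.update(x, logged_arm, reward)
            matched += 1
            payoff += reward
    return payoff / max(matched, 1)  # estimated per-step reward (e.g., CTR)

def bred(policy_factory, events, n_arms, n_boot=30, h=None, seed=0):
    """Hedged sketch of Bootstrapped Replay on Expanded Data (BRED).

    events: list of (context: np.ndarray, logged_arm: int, reward: float).
    Returns the mean and standard deviation of the bootstrap estimates.
    """
    rng = np.random.default_rng(seed)
    T = len(events)
    if h is None:
        h = 50.0 / np.sqrt(T)  # jitter level quoted in the table: h = 50/sqrt(T)
    estimates = []
    for _ in range(n_boot):
        # Expanded bootstrap replicate: K*T draws with replacement, so that
        # replay (which accepts ~1/K of the events under uniform logging)
        # evaluates the policy over roughly T steps, mitigating the
        # "time acceleration issue" mentioned in the Open Datasets row.
        idx = rng.integers(0, T, size=n_arms * T)
        replicate = [
            (events[i][0] + rng.normal(0.0, h, size=events[i][0].shape),  # jitter contexts
             events[i][1], events[i][2])
            for i in idx
        ]
        estimates.append(replay(policy_factory(), replicate))
    return float(np.mean(estimates)), float(np.std(estimates))
```

Drawing a fresh policy from policy_factory per replicate matters: each bootstrap run must evaluate the learning algorithm from scratch, otherwise state leaks across replicates and the spread of the estimates no longer reflects evaluation uncertainty.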
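The jitter schedule quoted in the Experiment Setup row, h = 50/√T, shrinks as the log grows, which matches the quoted observation that jittering matters most on small datasets. A quick check over hypothetical dataset sizes:

```python
# Jitter level h = 50/sqrt(T) for a few illustrative dataset sizes.
for T in (1_000, 100_000, 10_000_000):
    print(f"T = {T:>10,}   h = {50.0 / T ** 0.5:.4f}")
# T =      1,000   h = 1.5811
# T =    100,000   h = 0.1581
# T = 10,000,000   h = 0.0158
```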