Improving offline evaluation of contextual bandit algorithms via bootstrapping techniques
Authors: Jérémie Mary, Philippe Preux, Olivier Nicol
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide both theoretical and experimental proofs of its superiority compared to state-of-the-art methods, as well as an analysis of the convergence of the measure of quality. |
| Researcher Affiliation | Academia | University of Lille / LIFL (CNRS) & INRIA Lille Nord Europe, 59650 Villeneuve d'Ascq, France |
| Pseudocode | Yes | The complete procedure, called Bootstrapped Replay on Expanded Data (BRED), is implemented in algorithm 1. (A sketch of the procedure follows this table.) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | "Note that the experiments are run on synthetic data, for reasons that we will detail, and also on a large publicly available dataset" (Introduction); "on a publicly available dataset made from Yahoo! server logs and on synthetic data presenting the time acceleration issue" (Conclusion); "Yahoo! R6B dataset (Yahoo! Research, 2012)" (Section 3). |
| Dataset Splits | No | The paper describes using synthetic data and portions of the Yahoo! R6B dataset, and how "ground truth" was estimated for comparison. However, it does not provide specific train/validation/test splits (e.g., percentages or sample counts) or a detailed methodology for creating a separate validation set for model development or hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments, only general statements about data and algorithms. |
| Software Dependencies | No | The paper mentions specific algorithms but does not provide details on the software dependencies or specific version numbers of any libraries, frameworks, or programming languages used for implementation. |
| Experiment Setup | Yes | "Figure 2 displays the results and interpretation of an experiment which consists in evaluating LinUCB(α = 1) using the different methods. ... Empirically, a good choice for the level of jitter seems to be a function in O(1/√T), with T the size of the dataset. Note that this is proportional to the standard deviation of the posterior of the data. The results confirm our intuition: jittering is very important when the dataset is small but gets less and less necessary as the dataset grows. (h = 50/√T here)" (A sketch of this setup follows this table.) |
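
To make the Pseudocode and Experiment Setup rows concrete, here is a minimal sketch of the BRED procedure, assuming a log of (context, action, reward) triples collected by a uniformly random policy. The helper names (`replay_evaluate`, `make_policy`) and the expansion of each bootstrap sample to `n_actions * T` records are illustrative assumptions, not details quoted from the paper; the jitter bandwidth h = 50/√T is taken from the Experiment Setup row above.

```python
import numpy as np

def replay_evaluate(policy, contexts, actions, rewards):
    """Rejection-sampling replay (Li et al., 2011): keep a logged record only
    when the evaluated policy chooses the action that was actually logged."""
    total, n_kept = 0.0, 0
    for x, a, r in zip(contexts, actions, rewards):
        if policy.select(x) == a:      # policy agrees with the log
            policy.update(x, a, r)     # the bandit algorithm learns online
            total += r
            n_kept += 1
    return total / max(n_kept, 1)      # estimated per-step reward (e.g. CTR)

def bred_evaluate(make_policy, contexts, actions, rewards, n_actions,
                  n_bootstrap=10, rng=None):
    """BRED sketch: run replay on jittered bootstrap resamples of the log,
    each expanded to n_actions * T records (an assumed expansion factor),
    and average the resulting estimates."""
    rng = rng or np.random.default_rng(0)
    contexts = np.asarray(contexts, dtype=float)
    T = len(rewards)
    h = 50.0 / np.sqrt(T)              # jitter level quoted in the paper
    estimates = []
    for _ in range(n_bootstrap):
        idx = rng.integers(0, T, size=n_actions * T)   # expanded bootstrap sample
        # Smoothed bootstrap: add Gaussian jitter of bandwidth h to contexts.
        x_b = contexts[idx] + rng.normal(0.0, h, size=(len(idx), contexts.shape[1]))
        estimates.append(replay_evaluate(make_policy(), x_b,
                                         np.asarray(actions)[idx],
                                         np.asarray(rewards)[idx]))
    return float(np.mean(estimates))
```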
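The Experiment Setup row evaluates LinUCB(α = 1); below is a standard disjoint LinUCB (Li et al., 2010) written against the `select`/`update` interface assumed by the sketch above. This is a textbook implementation, not code from the paper.

```python
import numpy as np

class LinUCB:
    """Disjoint LinUCB (Li et al., 2010) with per-arm ridge regression."""
    def __init__(self, n_actions, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_actions)]    # per-arm Gram matrix
        self.b = [np.zeros(dim) for _ in range(n_actions)]  # per-arm reward vector

    def select(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                    # ridge-regression estimate
            # Upper confidence bound on the expected reward of this arm.
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))            # optimistic arm choice

    def update(self, x, action, reward):
        self.A[action] += np.outer(x, x)
        self.b[action] += reward * x

# Usage with the BRED sketch above (contexts/actions/rewards are hypothetical arrays):
# estimate = bred_evaluate(lambda: LinUCB(n_actions, dim), contexts, actions,
#                          rewards, n_actions)
```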