Are sample means in multi-armed bandits positively or negatively biased?
Authors: Jaehyeok Shin, Aaditya Ramdas, Alessandro Rinaldo
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide examples of optimistic rules of each type, demonstrate that simulations confirm our theoretical predictions, and pose some natural but hard open problems. [...] In Section 4, we demonstrate the correctness of our theoretical predictions through simulations in a variety of practical situations. [...] 4 Numerical experiments |
| Researcher Affiliation | Academia | Department of Statistics and Data Science¹, Machine Learning Department², Carnegie Mellon University; {shinjaehyeok, aramdas, arinaldo}@cmu.edu |
| Pseudocode | No | The paper describes algorithms such as lil'UCB using textual explanations and mathematical formulas (Section 4.3), but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about releasing source code for the methodology described, nor does it provide any links to a code repository. |
| Open Datasets | No | The paper conducts simulations using 'unit-variance Gaussian arms' (e.g., Section 4.1) which are defined distributions for the purpose of the experiment, rather than external, publicly available datasets for which access information would be provided. |
| Dataset Splits | No | The paper describes simulation setups such as using 'three unit-variance Gaussian arms' and repeating trials (Section 4.1), but it does not specify any training, validation, or test dataset splits, as the experiments involve simulations rather than traditional model training on split datasets. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, or memory specifications, used for running the experiments. |
| Software Dependencies | No | The paper mentions specific algorithms and references related works but does not provide details about the programming languages, software libraries, or their specific version numbers used for the simulations or implementations. |
| Experiment Setup | Yes | To demonstrate this, we conduct a simulation study in which we have three unit-variance Gaussian arms with µ₁ = 1, µ₂ = 2 and µ₃ = 3. After sampling once from each arm, greedy, UCB and Thompson sampling are used to continue sampling until T = 200. We repeat the whole process from scratch 10^4 times for each algorithm to get an accurate estimate for the bias. [...] We choose M = 200, w = 10 and α = 0.1. As before, we repeat each experiment 10^4 times for each setting. [...] We set 3 unit-variance Gaussian arms with means (µ₁, µ₂, µ₃) = (g, 0, g) for each gap parameter g = 1, 3, 5. We conduct 10^4 trials of the lil'UCB algorithm with a valid choice of parameters described in Jamieson et al. [2014, Section 5]. |
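
Since the paper releases no code, the first setup quoted in the Experiment Setup row can be illustrated with a minimal sketch. It covers only the greedy rule; UCB and Thompson sampling are omitted. The function name `greedy_bias`, the seed handling, and the reading that T = 200 counts the three initial pulls are assumptions made for illustration, not details taken from the paper.

```python
import random


def greedy_bias(means, horizon=200, n_trials=2000, seed=0):
    """Monte Carlo estimate of the bias of each arm's sample mean
    when arms are chosen greedily (hypothetical helper, not the authors' code)."""
    rng = random.Random(seed)
    k = len(means)
    bias_sum = [0.0] * k
    for _ in range(n_trials):
        # Initialisation: sample once from each unit-variance Gaussian arm.
        counts = [1] * k
        sums = [rng.gauss(mu, 1.0) for mu in means]
        # Assumption: the horizon T includes the k initial pulls.
        for _ in range(horizon - k):
            # Greedy rule: pull the arm with the largest current sample mean.
            arm = max(range(k), key=lambda a: sums[a] / counts[a])
            sums[arm] += rng.gauss(means[arm], 1.0)
            counts[arm] += 1
        # Accumulate (sample mean - true mean) for every arm at time T.
        for a in range(k):
            bias_sum[a] += sums[a] / counts[a] - means[a]
    return [b / n_trials for b in bias_sum]


if __name__ == "__main__":
    # The quoted setup uses 10^4 repetitions; the default here is smaller to run quickly.
    print(greedy_bias([1.0, 2.0, 3.0]))
```

Passing `n_trials=10_000` matches the repetition count quoted in the table; each entry of the returned list is the average of (sample mean minus true mean) for one arm.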