Robust Contextual Bandits via Bootstrapping
Authors: Qiao Tang, Hong Xie, Yunni Xia, Jia Lee, Qingsheng Zhu
Pages: 12182-12189
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We prove that BootLinUCB has a sub-linear regret upper bound and also conduct extensive experiments to validate its superior performance. We conduct extensive experiments to validate the superior performance of our BootLinUCB algorithm over the latest bootstrapping-based LinUCB algorithm (Hao et al. 2019) and the classical LinUCB algorithm (Chu et al. 2011). |
| Researcher Affiliation | Academia | Chongqing Key Laboratory of Software Theory and Technology, Chongqing University |
| Pseudocode | Yes | Algorithm 1: BootLinUCB algorithmic framework. Algorithm 2: BootQuantile(x_a, H_{t-1}, 1 - 2α_t). |
| Open Source Code | No | The paper mentions 'details refer to our code' but does not provide a specific link, repository, or explicit statement of open-source availability for its methodology. |
| Open Datasets | No | The paper uses synthetic data generated by the authors, stating 'We generate the feature vectors of A arms as follows: (1) generate min{d, A} orthogonal feature vectors with unit square norm (details refer to our code); (2) each of the remaining A - min{d, A} feature vectors is drawn from [0, 1]^d uniformly at random. The preference parameter θ is drawn from [0, 1]^d uniformly at random.' However, it does not provide concrete access information (link, DOI, formal citation) for this generated dataset. |
| Dataset Splits | No | The paper describes an online learning framework over 'T = 2000 decision rounds' and uses synthetic data, but it does not specify explicit training, validation, or test dataset splits in the traditional sense for reproducibility. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud computing specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies, such as programming language versions or library versions, that were used for the experiments. |
| Experiment Setup | Yes | Consider T = 2000 decision rounds. ... We set α_t = 1/√(t + 2), δ = 1/(t + 2) for our BootLinUCB, and set α_t = 1/√(t + 2) for the LinUCB algorithm. ... Unless we state explicitly, we consider the following default parameters: A = 20 arms, features with d = 10 dimension, regularization parameter λ = 1, reward variance σ = 1. ... We use Monte Carlo simulation to estimate the quantile estimator q̂_t(x, ϵ̂, α) with 1000 simulation rounds. |
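
Since the paper's code is not linked, the synthetic data described in the Open Datasets and Experiment Setup rows can only be approximated. The following is a minimal sketch, not the authors' implementation. It assumes the orthogonal unit-norm vectors can be obtained from a QR decomposition of a random Gaussian matrix, that rewards follow the standard linear model x·θ plus noise, and that the noise is Gaussian. All function and parameter names (`generate_arm_features`, `generate_theta`, `sample_reward`) are hypothetical.

```python
import numpy as np


def generate_arm_features(num_arms=20, dim=10, rng=None):
    """Build an (A, d) feature matrix following the two-step recipe in the table.

    Assumption: the orthogonal vectors come from a QR decomposition; the paper
    defers this detail to its (unlinked) code.
    """
    rng = np.random.default_rng() if rng is None else rng
    k = min(dim, num_arms)
    # Step 1: k mutually orthogonal feature vectors with unit norm.
    q, _ = np.linalg.qr(rng.standard_normal((dim, k)))
    orthogonal = q.T  # shape (k, dim), rows are orthonormal
    # Step 2: the remaining A - k vectors are uniform on [0, 1]^d.
    remaining = rng.uniform(0.0, 1.0, size=(num_arms - k, dim))
    return np.vstack([orthogonal, remaining])


def generate_theta(dim=10, rng=None):
    """Draw the preference parameter theta uniformly from [0, 1]^d."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.uniform(0.0, 1.0, size=dim)


def sample_reward(x, theta, sigma=1.0, rng=None):
    """Linear reward with additive noise (Gaussian noise is an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    return float(x @ theta + sigma * rng.standard_normal())


# Default setting from the table: A = 20 arms, d = 10 dimensions.
rng = np.random.default_rng(0)
features = generate_arm_features(num_arms=20, dim=10, rng=rng)
theta = generate_theta(dim=10, rng=rng)
reward = sample_reward(features[0], theta, sigma=1.0, rng=rng)
```

With the default σ = 1 the variance and the standard deviation coincide, so the sketch does not need to decide which of the two the paper's "reward variance" denotes; for other values that choice would matter.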