Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits
Authors: Avishek Ghosh, Abishek Sankararaman
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we validate the theoretical findings of Section 5 via simulations. We assume that the contexts are drawn i.i.d. from Unif[-1/d, 1/d]^d. We run Algorithm 1 with K = 20 arms and different dimensions d = {20, 25, 30}. Moreover, we compare our results with those of OFUL (Algorithm 2), and show that LR-SCB attains much smaller regret than OFUL. |
| Researcher Affiliation | Academia | 1 Halıcıoğlu Data Science Institute (HDSI), UC San Diego, USA 2 Dept. of Electrical Engineering and Computer Sciences, UC Berkeley, USA (Abishek is currently with Amazon AWS AI, Palo Alto, USA, but this work was done outside the scope of Amazon). |
| Pseudocode | Yes | Algorithm 1 Low Regret Stochastic Contextual Bandits (LR-SCB) |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper describes how contexts were generated for simulations ('We assume that the contexts are drawn i.i.d. from Unif[-1/d, 1/d]^d'), but it does not provide access information (link, DOI, formal citation) for a publicly available or open dataset. |
| Dataset Splits | No | The paper describes an epoch-based learning algorithm, but it does not specify explicit training/validation/test dataset splits, percentages, or a detailed splitting methodology. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running its simulations, such as GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers, used for its experiments or analysis. |
| Experiment Setup | No | The paper mentions running simulations with K = 20 arms and different dimensions d = {20, 25, 30} over 50 trials, but it does not provide specific experimental setup details such as hyperparameter values, training configurations, or optimizer settings. |
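Since the paper reports only the context distribution, the number of arms, and the dimensions, a reader wishing to reproduce the comparison must fill in the remaining details. The sketch below illustrates one plausible setup: contexts drawn i.i.d. from Unif[-1/d, 1/d]^d for K = 20 arms, played against an OFUL-style ridge-regression UCB baseline. The unknown parameter `theta`, the noise level, the regularizer `lam`, and the confidence width `beta` are all assumptions not stated in the paper, and LR-SCB itself (Algorithm 1) is not reimplemented here.

```python
import numpy as np

def simulate_oful(d=20, K=20, T=2000, noise_sd=0.1, seed=0):
    """Run an OFUL-style linear-bandit baseline on synthetic contexts.

    Contexts are drawn i.i.d. from Unif[-1/d, 1/d]^d, following the
    paper's simulation description. The unit-norm parameter theta,
    noise level, and confidence width are illustrative assumptions.
    Returns the cumulative pseudo-regret over T rounds.
    """
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(d)
    theta /= np.linalg.norm(theta)          # hypothetical unit-norm parameter

    lam, beta = 1.0, 1.0                    # ridge regularizer and UCB width (assumed)
    V = lam * np.eye(d)                     # regularized Gram matrix
    b = np.zeros(d)                         # running sum of reward-weighted contexts
    regret = np.zeros(T)

    for t in range(T):
        X = rng.uniform(-1.0 / d, 1.0 / d, size=(K, d))  # fresh contexts per round
        theta_hat = np.linalg.solve(V, b)                # ridge estimate
        V_inv = np.linalg.inv(V)
        # UCB score: <x, theta_hat> + beta * ||x||_{V^{-1}} for each arm
        widths = np.sqrt(np.einsum("kd,de,ke->k", X, V_inv, X))
        arm = int(np.argmax(X @ theta_hat + beta * widths))
        reward = X[arm] @ theta + noise_sd * rng.standard_normal()
        regret[t] = np.max(X @ theta) - X[arm] @ theta   # instantaneous pseudo-regret
        V += np.outer(X[arm], X[arm])
        b += reward * X[arm]
    return np.cumsum(regret)

cum_regret = simulate_oful()
```

Averaging `simulate_oful` over 50 seeds for each d in {20, 25, 30} would mirror the paper's reported protocol; comparing against an LR-SCB implementation would additionally require the epoch schedule of Algorithm 1, which the extracted text does not specify.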