Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits

Authors: Avishek Ghosh, Abishek Sankararaman

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we validate the theoretical findings of Section 5 via simulations. We assume that the contexts are drawn i.i.d. from Unif[−1/d, 1/d]^d. We run Algorithm 1 with K = 20 arms and different dimensions d ∈ {20, 25, 30}. Moreover, we compare our results with those of OFUL (Algorithm 2) and show that LR-SCB attains much smaller regret than OFUL.
Researcher Affiliation | Academia | (1) Halıcıoğlu Data Science Institute (HDSI), UC San Diego, USA; (2) Dept. of Electrical Engineering and Computer Sciences, UC Berkeley, USA (Abishek is currently with Amazon AWS AI, Palo Alto, USA, but the work was done outside the scope of Amazon).
Pseudocode | Yes | Algorithm 1: Low Regret Stochastic Contextual Bandits (LR-SCB).
Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | No | The paper describes how contexts were generated for simulations (contexts drawn i.i.d. from Unif[−1/d, 1/d]^d), but it does not provide access information (link, DOI, or formal citation) for a publicly available or open dataset.
Dataset Splits | No | The paper describes an epoch-based learning algorithm, but it does not specify explicit training/validation/test dataset splits, percentages, or a detailed splitting methodology.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running its simulations, such as GPU or CPU models.
Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers, used for its experiments or analysis.
Experiment Setup | No | The paper mentions running simulations with K = 20 arms and different dimensions d ∈ {20, 25, 30} over 50 trials, but it does not provide specific experimental setup details such as hyperparameter values, training configurations, or optimizer settings.
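The simulation setup quoted above (contexts drawn i.i.d. from Unif[−1/d, 1/d]^d, K = 20 arms, d ∈ {20, 25, 30}) can be sketched as a generic optimism-based contextual linear bandit loop. This is a minimal sketch of the OFUL-style baseline, not the paper's LR-SCB algorithm; the hidden parameter `theta_star`, the noise level, and the confidence scaling `beta` are illustrative assumptions not specified in the text.

```python
import numpy as np


def simulate_oful(T=2000, K=20, d=20, lam=1.0, beta=1.0, seed=0):
    """Generic OFUL/LinUCB-style simulation sketch (illustrative, not LR-SCB).

    Each round, K contexts are drawn i.i.d. from Unif[-1/d, 1/d]^d, matching
    the setup quoted from the paper; theta_star, the noise level (0.1), and
    beta are assumptions made for this sketch.  Returns cumulative regret.
    """
    rng = np.random.default_rng(seed)
    theta_star = rng.normal(size=d)
    theta_star /= np.linalg.norm(theta_star)  # unknown unit-norm parameter

    V = lam * np.eye(d)       # regularized Gram matrix of played contexts
    b = np.zeros(d)           # running sum of reward-weighted contexts
    cum_regret = np.zeros(T)
    regret = 0.0

    for t in range(T):
        # Fresh context for each of the K arms this round.
        X = rng.uniform(-1.0 / d, 1.0 / d, size=(K, d))
        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b  # ridge estimate of theta_star

        # Optimistic index: estimated reward plus an exploration bonus
        # proportional to the context's Mahalanobis norm under V_inv.
        bonus = beta * np.sqrt(np.einsum("kd,df,kf->k", X, V_inv, X))
        a = int(np.argmax(X @ theta_hat + bonus))

        reward = X[a] @ theta_star + 0.1 * rng.normal()
        regret += (X @ theta_star).max() - X[a] @ theta_star
        cum_regret[t] = regret

        # Rank-one update of the Gram matrix and the reward vector.
        V += np.outer(X[a], X[a])
        b += reward * X[a]

    return cum_regret
```

Comparing this curve against an epoch-based low-regret variant for d ∈ {20, 25, 30}, averaged over 50 trials, would mirror the experiment the paper describes, though the paper itself does not report the hyperparameters used.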