Provably Optimal Algorithms for Generalized Linear Contextual Bandits

Authors: Lihong Li, Yu Lu, Dengyong Zhou

ICML 2017

Reproducibility Variable — Result — LLM Response

Research Type — Theoretical — "In this work, we propose an upper confidence bound based algorithm for generalized linear contextual bandits, which achieves an Õ(√(dT)) regret over T rounds with d-dimensional feature vectors. This regret matches the minimax lower bound, up to logarithmic terms, and improves on the best previous result by a √d factor, assuming the number of arms is fixed. A key component in our analysis is to establish a new, sharp finite-sample confidence bound for maximum-likelihood estimates in generalized linear models, which may be of independent interest."

Researcher Affiliation — Collaboration — Lihong Li¹, Yu Lu², Dengyong Zhou¹. ¹Microsoft Research, Redmond, WA 98052; ²Department of Statistics, Yale University, New Haven, CT, USA.

Pseudocode — Yes — Algorithm 1 (UCB-GLM), Algorithm 2 (CB-GLM), Algorithm 3 (SupCB-GLM).

Open Source Code — No — The paper makes no statement about open-source code availability and includes no links to code repositories.

Open Datasets — No — The paper defines a theoretical problem setting in which "a context consisting of a set of K feature vectors ... is drawn IID from an unknown distribution ν"; it does not specify or provide access information for any publicly available dataset.

Dataset Splits — No — As a theoretical work, the paper describes no experimental setup involving data splits (e.g., training, validation, or test sets).

Hardware Specification — No — The paper is theoretical and focuses on algorithm design and analysis; it does not mention any specific hardware used for experiments.

Software Dependencies — No — The paper is theoretical and focuses on algorithm design and analysis; it does not mention any specific software dependencies with version numbers.

Experiment Setup — No — The paper focuses on theoretical analysis and algorithm development; it does not provide details about experimental setup, hyperparameters, or training configurations.
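To make the UCB-GLM pseudocode referenced above concrete, the following is a minimal illustrative sketch, not the paper's implementation: it assumes a logistic link function, a random exploration phase of length `tau`, and an exploration constant `alpha`, with a maximum-likelihood estimate refit by Newton's method each round. All function names, the synthetic environment, and the constants are our own choices for illustration.

```python
import numpy as np

def mu(z):
    # Logistic link function (an assumed choice of GLM link).
    return 1.0 / (1.0 + np.exp(-z))

def mle_logistic(X, y, iters=20):
    # Newton's method for the logistic-GLM maximum-likelihood estimate.
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = mu(X @ theta)
        g = X.T @ (y - p)                       # gradient of log-likelihood
        W = p * (1.0 - p)
        # Hessian, with a tiny ridge term for numerical stability.
        H = (X * W[:, None]).T @ X + 1e-6 * np.eye(X.shape[1])
        theta += np.linalg.solve(H, g)
    return theta

def ucb_glm(contexts, reward_fn, T, tau, alpha, rng):
    """contexts(t) -> (K, d) array of arm features at round t;
    reward_fn(t, a) -> observed reward for pulling arm a at round t."""
    d = contexts(0).shape[1]
    V = np.zeros((d, d))                        # design matrix sum x x^T
    X_hist, y_hist, chosen = [], [], []
    for t in range(T):
        A = contexts(t)
        if t < tau:
            # Pure-exploration phase: pick an arm uniformly at random.
            a = int(rng.integers(A.shape[0]))
        else:
            theta = mle_logistic(np.array(X_hist), np.array(y_hist))
            Vinv = np.linalg.inv(V + 1e-6 * np.eye(d))
            # Confidence width sqrt(x^T V^{-1} x) for each arm.
            width = np.sqrt(np.einsum('kd,de,ke->k', A, Vinv, A))
            # Optimistic (UCB) arm choice.
            a = int(np.argmax(A @ theta + alpha * width))
        x = A[a]
        y = reward_fn(t, a)
        V += np.outer(x, x)
        X_hist.append(x)
        y_hist.append(y)
        chosen.append(a)
    return chosen, mle_logistic(np.array(X_hist), np.array(y_hist))
```

As a design note, this sketch refits the MLE from scratch every round for clarity; an efficient implementation would warm-start Newton's method from the previous estimate and update V⁻¹ incrementally via the Sherman–Morrison formula.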