Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Provably Optimal Algorithms for Generalized Linear Contextual Bandits
Authors: Lihong Li, Yu Lu, Dengyong Zhou
ICML 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we propose an upper confidence bound based algorithm for generalized linear contextual bandits, which achieves an O(d T) regret over T rounds with d dimensional feature vectors. This regret matches the minimax lower bound, up to logarithmic terms, and improves on the best previous result by a d factor, assuming the number of arms is fixed. A key component in our analysis is to establish a new, sharp finite-sample confidence bound for maximumlikelihood estimates in generalized linear models, which may be of independent interest. |
| Researcher Affiliation | Collaboration | Lihong Li 1 Yu Lu 2 Dengyong Zhou 1 1Microsoft Research, Redmond, WA 98052 2Department of Statistics, Yale University, New Haven, CT, USA. |
| Pseudocode | Yes | Algorithm 1 UCB-GLM Algorithm 2 CB-GLM Algorithm 3 Sup CB-GLM |
| Open Source Code | No | The paper does not provide any statements about open-source code availability, nor does it include links to code repositories. |
| Open Datasets | No | The paper defines a theoretical problem setting where 'a context consisting of a set of K feature vectors... which is drawn IID from an unknown distribution ν', but it does not specify or provide access information for a publicly available dataset. |
| Dataset Splits | No | The paper does not describe any experimental setup involving data splits (e.g., training, validation, test sets), as it is a theoretical work. |
| Hardware Specification | No | The paper is theoretical and focuses on algorithm design and analysis; therefore, it does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and focuses on algorithm design and analysis; therefore, it does not mention any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper focuses on theoretical analysis and algorithm development; thus, it does not provide details about experimental setup, hyperparameters, or training configurations. |