Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update

Authors: Yu-Jie Zhang, Sheng-An Xu, Peng Zhao, Masashi Sugiyama

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	This section evaluates the proposed method on two representative GLB problems: logistic bandits (µ(z) = 1/(1 + e^{-z})) with bounded rewards, and Poisson bandits (µ(z) = e^z), which pose a distinct challenge as an unbounded GLB setting. We also conduct experiments on real data from the Covertype dataset [Blackard, 1998], with more detailed results provided in Appendix E.
Researcher Affiliation	Academia	1 RIKEN AIP, Tokyo, Japan 2 National Key Laboratory for Novel Software Technology, Nanjing University, China 3 School of Artificial Intelligence, Nanjing University, China 4 The University of Tokyo, Chiba, Japan Correspondence: Peng Zhao <EMAIL>
Pseudocode	Yes	Algorithm 1 GLB-OMD Input: Self-concordant constant R, Lipschitz constant L_µ, parameter radius S, confidence level δ. 1: Initialize θ_1 ∈ Θ := {θ ∈ R^d | ‖θ‖_2 ≤ S} and H_1 = λ I_d. 2: for t = 1 to T do 3: Construct the confidence set C_t(δ) according to (5). 4: Select the arm X_t according to rule (6) and receive the reward r_t. 5: Update the online estimator θ_{t+1} by (3) and set H_{t+1} = H_t + ∇²ℓ_t(θ_{t+1}). 6: end for
Open Source Code	No	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: The code and data are not released.
Open Datasets	Yes	We also conduct experiments on real data from the Covertype dataset [Blackard, 1998]
Dataset Splits	No	The paper describes partitioning the data into K=60 clusters for defining arms and binarizing rewards for the Covertype dataset. It mentions setting the horizon T=1000. However, it does not explicitly provide information on train/test/validation splits (percentages, sample counts, or methodology) for reproducing data partitioning.
Hardware Specification	Yes	All the experiments were conducted on Intel Xeon Gold 6242R processors (40 cores, 4.1GHz base frequency).
Software Dependencies	No	The algorithms were implemented in Python, utilizing the scipy library for numerical computations, such as solving non-linear optimization problems and calculating vector norms, and employing np.linalg.pinv to compute the pseudo-inverse of matrices. The running time was measured using the time library.
Experiment Setup	Yes	Throughout our experiments, all algorithm parameters were configured according to their theoretical derivations without additional fine-tuning, with the sole exception of the regularization parameter λ. To ensure a fair comparison, we adopted a unified approach for setting λ across different algorithm categories: we set λ = d for all efficient online algorithms (including GLB-OMD, RS-GLinCB, ECOLog, and GLOC), while using λ = d log(1 + t) for offline algorithms that require regularization. For this task, we set the horizon to T = 1000 and the confidence parameter to δ = 0.01. After analyzing the data, we set S = 6 and κ = 200.