Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Does Stochastic Gradient really succeed for bandits?

Authors: Dorian Baudry, Emmeran Johnson, Simon Vary, Ciara Pike-Burke, Patrick Rebeschini

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, in Section 4 we present some synthetic experiments that illustrate our theoretical findings. We now provide empirical support for the theoretical findings presented in previous sections, focusing on the performance of SGB as a function of its learning rate η.
Researcher Affiliation	Academia	1 Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG, 38000 Grenoble, France; 2 Department of Statistics, University of Oxford; 3 Department of Mathematics, Imperial College London
Pseudocode	Yes	For completeness, we provide the pseudo-code of SGB in Algorithm 1 in Appendix A.1.
Open Source Code	No	we will make our code available with the paper at the time of publication.
Open Datasets	No	For simplicity, all experiments use Rademacher distributions unless stated otherwise.
Dataset Splits	No	The paper discusses "independent trajectories of SGB" and "empirical regret" but does not mention training/test/validation splits as it deals with sequential decision-making in a bandit setting, not typical supervised learning.
Hardware Specification	No	the experiments can be reproduced on a standard personal laptop.
Software Dependencies	No	The pseudo-code is provided, and the main algorithm is very simple to implement.
Experiment Setup	Yes	For the second experiment, we consider K = 10 arms and the instance defined by ν1 = Rad(0.1), ν2 = δ0 and ν3 = … = ν10 = δ 1, in order to support Thm. 3, and more precisely the conjecture that the critical learning rate is η = 2 /K for K-armed problems. Thus, we compare the performance of SGB with learning rates (ηi)i∈[5] = 2 , , 5 . For each setup, we run 10^4 independent trajectories of SGB over horizon T = 2 × 10^4.
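
The experiment setup quoted above (independent SGB trajectories, with empirical regret compared across learning rates η) could be approximated along the following lines. This is only a minimal sketch under stated assumptions, not the authors' code. It assumes SGB is the standard softmax gradient-bandit update without a baseline, θ(b) ← θ(b) + η · r · (1{b = a} − π(b)); the paper's Algorithm 1 in Appendix A.1 is the authoritative description. It also assumes that Rad(μ) denotes a ±1 reward with mean μ. The helper names (`rademacher`, `constant`, `run_sgb`, `average_regret`) and the two-armed instance in the usage lines are hypothetical and chosen for illustration.

```python
import math
import random


def softmax(theta):
    """Numerically stable softmax over a list of floats."""
    m = max(theta)
    exps = [math.exp(x - m) for x in theta]
    total = sum(exps)
    return [e / total for e in exps]


def rademacher(mean):
    """Sampler for a +1/-1 reward with the given mean (assumed meaning of Rad)."""
    p_plus = (1.0 + mean) / 2.0
    return lambda rng: 1.0 if rng.random() < p_plus else -1.0


def constant(value):
    """Sampler for a Dirac (deterministic) reward."""
    return lambda rng: value


def run_sgb(arms, means, eta, horizon, rng):
    """Run one trajectory and return its cumulative pseudo-regret.

    Assumed update: theta[b] += eta * r * (1{b == a} - pi[b]).
    """
    k = len(arms)
    theta = [0.0] * k          # uniform initial policy
    best = max(means)
    regret = 0.0
    for _ in range(horizon):
        pi = softmax(theta)
        a = rng.choices(range(k), weights=pi)[0]   # sample an arm from pi
        r = arms[a](rng)                           # observe its reward
        for b in range(k):
            indicator = 1.0 if b == a else 0.0
            theta[b] += eta * r * (indicator - pi[b])
        regret += best - means[a]
    return regret


def average_regret(arms, means, eta, horizon, n_runs, seed=0):
    """Average pseudo-regret over n_runs independent trajectories."""
    rng = random.Random(seed)
    total = sum(run_sgb(arms, means, eta, horizon, rng) for _ in range(n_runs))
    return total / n_runs


if __name__ == "__main__":
    # Hypothetical two-armed instance; sizes are far smaller than in the paper.
    arms = [rademacher(0.1), constant(0.0)]
    means = [0.1, 0.0]
    for eta in (0.05, 0.2, 1.0):
        print(eta, average_regret(arms, means, eta, horizon=2000, n_runs=20))
```

The usage lines use a short horizon and few runs so that the sketch finishes quickly; the quoted setup uses 10^4 trajectories over a horizon of 2 × 10^4, which this pure-Python loop would handle only slowly.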