Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Does Stochastic Gradient really succeed for bandits?
Authors: Dorian Baudry, Emmeran Johnson, Simon Vary, Ciara Pike-Burke, Patrick Rebeschini
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, in Section 4 we present some synthetic experiments that illustrate our theoretical findings. We now provide empirical support for the theoretical findings presented in previous sections, focusing on the performance of SGB as a function of its learning rate η. |
| Researcher Affiliation | Academia | 1 Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG, 38000 Grenoble, France; 2 Department of Statistics, University of Oxford; 3 Department of Mathematics, Imperial College London |
| Pseudocode | Yes | For completeness, we provide the pseudo-code of SGB in Algorithm 1 in Appendix A.1. |
| Open Source Code | No | we will make our code available with the paper at the time of publication. |
| Open Datasets | No | For simplicity, all experiments use Rademacher distributions unless stated otherwise. |
| Dataset Splits | No | The paper discusses "independent trajectories of SGB" and "empirical regret" but does not mention training/test/validation splits as it deals with sequential decision-making in a bandit setting, not typical supervised learning. |
| Hardware Specification | No | the experiments can be reproduced on a standard personal laptop. |
| Software Dependencies | No | The pseudo-code is provided, and the main algorithm is very simple to implement. |
| Experiment Setup | Yes | For the second experiment, we consider K = 10 arms and the instance defined by ν1 = Rad(0.1), ν2 = δ0 and ν3 = … = ν10 = δ 1, in order to support Thm. 3, and more precisely the conjecture that the critical learning rate is η = 2 /K for K-armed problems. Thus, we compare the performance of SGB with learning rates (ηi)i ∈ [5] = 2 , , 5 . For each setup, we run 10^4 independent trajectories of SGB over horizon T = 2 × 10^4. |
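
The experiment setup quoted in the table can be illustrated with a minimal sketch of one SGB trajectory. This is not the authors' code (the table notes that none was released at the time of classification). It assumes that SGB is the standard softmax policy-gradient bandit update without a baseline, that Rad(m) denotes a ±1-valued reward with mean m, and that the policy starts uniform. The names `softmax`, `rademacher`, and `run_sgb` are hypothetical, and the two-armed instance at the bottom is a toy example rather than the paper's K = 10 instance.

```python
import math
import random


def softmax(theta):
    """Numerically stable softmax over a list of logits."""
    m = max(theta)
    exps = [math.exp(x - m) for x in theta]
    total = sum(exps)
    return [e / total for e in exps]


def rademacher(mean, rng):
    """Sample +1 or -1 with expectation `mean` (assumed reading of Rad(mean))."""
    return 1.0 if rng.random() < (1.0 + mean) / 2.0 else -1.0


def run_sgb(samplers, means, eta, horizon, rng):
    """Run one trajectory and return its cumulative pseudo-regret.

    samplers: one callable per arm, taking an rng and returning a reward.
    means:    true mean reward of each arm (used only to measure regret).
    eta:      learning rate.
    """
    k = len(samplers)
    theta = [0.0] * k  # assumption: uniform initial policy
    best = max(means)
    regret = 0.0
    for _ in range(horizon):
        pi = softmax(theta)
        arm = rng.choices(range(k), weights=pi)[0]
        reward = samplers[arm](rng)
        # Stochastic gradient step on the expected reward under the softmax policy.
        for a in range(k):
            indicator = 1.0 if a == arm else 0.0
            theta[a] += eta * reward * (indicator - pi[a])
        regret += best - means[arm]
    return regret


if __name__ == "__main__":
    # Toy instance: arm 0 is Rad(0.1), arm 1 always pays 0.
    rng = random.Random(0)
    samplers = [lambda r: rademacher(0.1, r), lambda r: 0.0]
    print(run_sgb(samplers, [0.1, 0.0], eta=0.05, horizon=2000, rng=rng))
```

To approximate the protocol described in the table, one would call `run_sgb` many times with independent seeds for each learning rate and average the returned regret.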