Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

No Weighted-Regret Learning in Adversarial Bandits with Delays

Authors: Ilai Bistritz, Zhengyuan Zhou, Xi Chen, Nicholas Bambos, Jose Blanchet

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We prove that the FKM algorithm with n dimensions achieves an expected regret of O n T 3 4 + n T 1 3 D 1 3 and the EXP3 algorithm with K arms achieves an expected regret of O p log K (KT + D) even when D = PT t=1 dt and T are unknown. These bounds use a novel doubling trick that, under mild assumptions, provably retains the regret bound for when D and T are known. Using these bounds, we show that FKM and EXP3 have no weighted-regret even for dt = O (t log t). Therefore, algorithms with no weighted-regret can be used to approximate a CCE of a ﬁnite or convex unknown game that can only be simulated with bandit feedback, even if the simulation involves signiﬁcant delays. Our first main contribution in this paper is to show that even with delays that cause a linear regret, the expected weighted ergodic distribution may still converge to the set of CCE, and the weighted ergodic average may still converge (in L1) to the set of NE for a two-player zero-sum game.
Researcher Affiliation	Academia	Ilai Bistritz EMAIL Department of Electrical Engineering Stanford University Stanford, CA 94305, USA Zhengyuan Zhou EMAIL Stern School of Business New York University New York, NY 10003, USA Xi Chen EMAIL Stern School of Business New York University New York, NY 10003, USA Nicholas Bambos EMAIL Department of Electrical Engineering Stanford University Stanford, CA 94305, USA Jose Blanchet EMAIL Department of Management Science & Engineering Stanford University Stanford, CA 94305, USA
Pseudocode	Yes	Algorithm 1 2D Doubling Trick Wrapper for Unknown T and D; Algorithm 2 FKM with delays; Algorithm 3 EXP3 with delays
Open Source Code	No	No explicit statement about code release or links to code repositories are found. The paper's license and attribution link refer to the paper itself, not source code for the methodology.
Open Datasets	No	The paper is theoretical and analyzes algorithms with abstract cost functions lt(at). It does not mention using any specific dataset or provide information on dataset access.
Dataset Splits	No	The paper does not conduct experiments on datasets, therefore no dataset splits are mentioned.
Hardware Specification	No	The paper is theoretical and does not describe any experiments that would require specific hardware.
Software Dependencies	No	The paper is theoretical and does not mention any specific software, libraries, or solvers with version numbers used for implementation or experimentation.
Experiment Setup	No	The paper does not describe any experiments; therefore, no experimental setup details or hyperparameters are provided.