Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Delay and Cooperation in Nonstochastic Bandits

Authors: Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour

JMLR 2019 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We introduce Exp3-Coop, a cooperative version of the Exp3 algorithm and prove that with K actions and N agents the average per-agent regret after T rounds is at most of order $\sqrt{\left(d + 1 + \frac{K}{N}\alpha_{\le d}\right)(T \ln K)}$, where $\alpha_{\le d}$ is the independence number of the d-th power of the communication graph G. We then show that for any connected graph, for $d = \sqrt{K}$ the regret bound is $K^{1/4}\sqrt{T}$, strictly better than the minimax regret $\sqrt{KT}$ for noncooperating agents.
Researcher Affiliation	Collaboration	Nicolò Cesa-Bianchi EMAIL Department of Computer Science & DSRC Università degli Studi di Milano 20133 Milano, Italy Claudio Gentile EMAIL Google Research New York, NY, USA Yishay Mansour EMAIL Google Research and Tel-Aviv University Tel-Aviv 6997801, Israel
Pseudocode	Yes	Our learning protocol is summarized in Figure 1, while Figure 2 contains a pictorial example. Our first algorithm, called Exp3-Coop (Cooperative Exp3) is described in Figure 3. The Exp3-Coop2 Algorithm Parameters: Undirected graph G = (V, E); learning rate η; exploration parameter δ > 0. The Exp3-Coop-Mix Algorithm Parameters: Undirected communication graph G = (V, E); maximal delay d; delay distribution D over {0, 1, ..., d − 1}; learning rate η > 0.
Open Source Code	No	The paper does not provide explicit statements or links indicating that source code for the described methodologies is openly available.
Open Datasets	No	The paper is theoretical and does not describe or use specific datasets for empirical evaluation. It refers to abstract 'action sets' and 'loss vectors' in its mathematical framework.
Dataset Splits	No	The paper is theoretical and does not perform experiments with datasets; therefore, there is no mention of dataset splits.
Hardware Specification	No	The paper focuses on theoretical analysis and algorithm design without performing empirical experiments, so no hardware specifications are provided.
Software Dependencies	No	The paper is theoretical and does not implement or run its algorithms, so no software dependencies with version numbers are mentioned.
Experiment Setup	No	The paper is theoretical and focuses on algorithm design and regret analysis, not empirical experimentation. It discusses algorithmic parameters (e.g., 'delay d', 'learning rate η', 'exploration parameter δ') in a theoretical context, but does not provide specific values for an experimental setup.
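
As a rough numeric illustration of the regret orders quoted in the Research Type row, the sketch below evaluates them with all constants dropped. The function names and the example values (K = 16, N = 16, T = 10 000) are illustrative assumptions and do not come from the paper. The independence number is passed in as a plain number rather than computed from a graph.

```python
import math


def coop_regret_order(K, N, T, d, alpha_le_d):
    """Order of the Exp3-Coop bound: sqrt((d + 1 + (K/N) * alpha) * T * ln K).

    alpha_le_d is the independence number of the d-th power of the
    communication graph, supplied by the caller (assumption: not computed here).
    """
    return math.sqrt((d + 1 + (K / N) * alpha_le_d) * T * math.log(K))


def sqrt_k_delay_order(K, T):
    """Order quoted for delay d = sqrt(K) on a connected graph: K^(1/4) * sqrt(T)."""
    return K ** 0.25 * math.sqrt(T)


def noncoop_minimax_order(K, T):
    """Order of the minimax regret for noncooperating agents: sqrt(K * T)."""
    return math.sqrt(K * T)


if __name__ == "__main__":
    K, N, T = 16, 16, 10_000  # hypothetical example values
    print("noncooperating:", noncoop_minimax_order(K, T))
    print("cooperating, d = sqrt(K):", sqrt_k_delay_order(K, T))
    # On a complete graph any two agents are neighbors, so alpha = 1 for d = 1.
    print("general bound, clique, d = 1:", coop_regret_order(K, N, T, 1, 1))
```

The ratio between the noncooperating order and the d = √K cooperative order is K^{1/4}, which is 2 for K = 16. The general bound carries an extra ln K factor, so its value is not directly comparable to the other two without the constants.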