Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Generalizing the Regret: an Analysis of Lower and Upper Bounds

Authors: Marco Mussi, Alberto Maria Metelli

JAIR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this appendix, we provide numerical examples to empirically validate our findings. We consider the performances of UCB1 with a bandit made of K = 10 arms over 10 runs and comparing the empirical regret (EXP, mean ± std) with the instance-dependent lower (LB) and upper (UB) bounds, for different choices of function g and for different time horizons T ∈ {1·10^5, 5·10^5, 1·10^6}. The results are presented in Table 2. We can observe how the empirical results are consistent with our theoretical findings for all the g and all the time horizons T considered.
Researcher Affiliation	Academia	Marco Mussi EMAIL Alberto Maria Metelli EMAIL Politecnico di Milano Piazza Leonardo da Vinci 32, Milan, 20133, Italy
Pseudocode	Yes	Algorithm 1 UCB1 (Auer et al., 2002; Bubeck, 2010). Require: number of arms K, exploration parameter a > 2, subgaussianity parameter σ Ni ← 0, µiˆ ← 0, U CBi ← ∞, ∀i ∈ [K] for t ∈ [T] do Select It ∈ arg max i∈[K] U CBi Play It and observe reward Xt Update µˆIt ← µˆIt NIt Xt NIt + 1 , NIt ← NIt + 1 Compute U CBi ← µˆi + σ √ a log t Ni , ∀i ∈ [K] end for Algorithm 2 MOSS (Audibert and Bubeck, 2009, 2010). Require: number of arms K, learning horizon T, subgaussianity parameter σ Ni ← 0, µiˆ ← 0, U CBi ← ∞, ∀i ∈ [K] for t ∈ [T] do Select It ∈ arg max i∈[K] U CBi Play It and observe reward Xt Update µˆIt ← µˆIt NIt Xt NIt + 1 , NIt ← NIt + 1 Compute U CBIt ← µˆIt + σ √ 4 NIt log T K NIt where log(x) ∆ log(max{1, x}) end for
Open Source Code	No	The paper does not contain any explicit statements or links regarding the availability of open-source code for the methodology described.
Open Datasets	No	The paper uses a 'bandit made of K = 10 arms' for numerical examples but does not provide concrete access information (link, DOI, repository, formal citation) for a publicly available or open dataset.
Dataset Splits	No	The paper's numerical examples are based on a simulated 'bandit made of K = 10 arms' and do not involve explicit training/test/validation dataset splits of a pre-existing dataset. No specific dataset split information is provided.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its numerical examples or experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details, such as library names with version numbers, used to implement or run the described algorithms or numerical examples.
Experiment Setup	No	While the paper mentions general parameters for numerical examples such as 'K = 10 arms' and 'different time horizons T ∈ {1·10^5, 5·10^5, 1·10^6}', it does not specify concrete hyperparameter values (e.g., the 'a' parameter for UCB1 mentioned in Algorithm 1) or other system-level training configurations used in the experiments.